Library of Math
New and Used Math Books at Great Low Prices

Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning)

Authors: Richard S. Sutton, Andrew G. Barto
Publisher: The MIT Press
Category: Book

List Price: $60.00
Buy New: $39.95
You Save: $20.05 (33%)



New (22) Used (10) from $36.95

Rating: 4.5 out of 5 stars 12 reviews
Sales Rank: 300379

Media: Hardcover
Pages: 322
Number Of Items: 1
Shipping Weight (lbs): 1.8
Dimensions (in): 9.1 x 7.2 x 1.2

ISBN: 0262193981
Dewey Decimal Number: 006.31
EAN: 9780262193986

Publication Date: March 1, 1998
Availability: Usually ships in 1-2 business days

Similar Items:

  • Pattern Recognition and Machine Learning (Information Science and Statistics)
  • Neuro-Dynamic Programming (Optimization and Neural Computation Series, 3)
  • Artificial Intelligence: A Modern Approach (2nd Edition) (Prentice Hall Series in Artificial Intelligence)
  • Machine Learning (Mcgraw-Hill International Edit)
  • Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley Series in Probability and Statistics)

Editorial Reviews:

Product Description
Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability.

The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.



Customer Reviews:

5 out of 5 stars An excellent introduction to Reinforcement Learning   March 1, 2000
David Tan (USA)
16 out of 18 found this review helpful

This is probably one of the best books I have read in the past year.

The authors present the subject with an excellent balance of mathematical, computational and intuitive material. The book also includes plenty of real-life examples to explain the concepts and motivations for the algorithms.

The book starts with examples and an intuitive introduction to and definition of reinforcement learning. It follows with three chapters on the three fundamental approaches to reinforcement learning: dynamic programming, Monte Carlo, and temporal-difference methods. Subsequent chapters build on these methods to generalize to a whole spectrum of solutions and algorithms.

The book is very readable for the average computer science student. Possibly the only difficult chapter is Chapter 8, which deals with some neural network concepts.

I highly recommend this book to anyone who wants to learn about this subject.


5 out of 5 stars Excellent introduction to reinforcement learning   August 3, 2003
Mihailo Despotovic (Silicon Valley, CA USA)
12 out of 13 found this review helpful

I have had this book for more than a year now and I am going through it for the second time, so I think I have a pretty good picture of it.

The book consists of three parts. In the first part, "The Problem", the authors define the scope of issues that reinforcement learning deals with and give some interesting introductory examples. Then they move on to the concept of evaluative feedback and, eventually, define the reinforcement learning problem formally.

The second part, "Elementary Solution Methods", consists of three more or less independent subparts: Dynamic Programming, Monte Carlo Methods and Temporal Difference Learning. All three fundamental reinforcement learning methods are presented in an interesting way, using good examples. Personally, I liked the TD-Learning part best and I agree that this method is indeed the central method and an original contribution of reinforcement learning to the field of machine learning.

The third part, "A Unified View", presents more advanced techniques. The last chapter gives the most important case studies in reinforcement learning, including Samuel's Checkers Player and Tesauro's TD-Gammon.

The book is very readable and every chapter ends with illustrative exercises (many of them are actually real programming projects!), an always useful summary, and very valuable bibliographical and historical remarks. Some subchapters are more advanced and are therefore marked with '*'. I really recommend the first two parts to any student of computer science or anyone interested in machine learning and fuzzy computing. The third part is much more advanced, but it would definitely be interesting for advanced computer scientists and graduate students.

This is still the first edition of the book, which means that the material is almost six years old, but it's the third printing, so there is a lot of interest. I would suggest that, for a second edition, the authors include solutions to (at least selected) exercises, something like Knuth did in "The Art of Computer Programming".


5 out of 5 stars It's a nice introductory text on Reinforcement Learning!   January 7, 1999
gosavi@eng.usf.edu Abhijit Gosavi (Tampa, Florida)
12 out of 14 found this review helpful

The book is easy and interesting to read. The diagrams, especially those on TD, shed a great deal of light on the basic concept of TD. The intuitive ideas behind RL are developed clearly. At the same time, all the fundamental concepts are made mathematically precise using very simple language and notation. Anybody new to RL should find this book extremely useful.


5 out of 5 stars An excellent introduction   November 5, 2004
Dr. Lee D. Carlson (Saint Louis, Missouri USA)
10 out of 14 found this review helpful


As a subfield of artificial intelligence, reinforcement learning has shown great success from both a theoretical and practical viewpoint. With numerous applications in finance, network engineering, robot toys, and games, it is clear that this learning paradigm shows even greater promise for future developments. The authors summarize the foundations of reinforcement learning, some of which come from their own work over the last decade.

The authors define reinforcement learning as learning how to map situations to actions so as to maximize a numerical reward. The machine that is engaged in reinforcement learning discovers on its own which actions will optimize the reward by trying out these actions. It is the ability of such a machine to learn from experience that distinguishes it from one that is engaged in supervised learning, for in the latter, examples are needed to guide the machine to the proper concept or knowledge. The authors emphasize the "exploration-exploitation" tradeoffs that reinforcement-learning machines have to deal with as they interact with the environment.
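
The exploration-exploitation tradeoff can be made concrete with a small sketch. The code below is not taken from the book; it is an illustrative epsilon-greedy learner on a made-up Gaussian bandit, and the function name, reward means, step count, and epsilon value are all assumptions chosen for the example.

```python
import random

def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Gaussian bandit; return value estimates and pull counts."""
    rng = random.Random(seed)
    k = len(true_means)
    estimates = [0.0] * k  # running average reward observed for each action
    counts = [0] * k       # number of times each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action to improve its estimate.
            action = rng.randrange(k)
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(k), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental sample-average update of the chosen action's estimate.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Hypothetical three-action problem; the third action has the highest mean reward.
estimates, counts = epsilon_greedy_bandit([0.0, 0.5, 2.0])
```

With a small epsilon the learner spends most of its pulls on the action it believes is best, while the occasional random pull keeps the other estimates from going stale.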

For the authors, a reinforcement learning system consists of a `policy', a `reward function', a `value function', and a `model' of the environment. A policy is a mapping from the states of the environment that are perceived by the machine to the actions that are to be taken by the machine when in those states. The reward function maps each perceived state of the environment to a number (the reward). A value function specifies what is good for the machine over the long run. A model, as the name implies, is a representation of the behavior of the environment. The authors emphasize that all of the reinforcement learning methods that are discussed in the book are concerned with the estimation of value functions, but they point out that other techniques are available for solving reinforcement learning problems, such as genetic algorithms and simulated annealing.
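
As a rough illustration of how these four pieces fit together, the sketch below encodes a policy, a reward function, and a deterministic model as plain dictionaries and then estimates the value function by repeated sweeps. The three-state chain, the names, and the discount factor are hypothetical and chosen only for this example.

```python
# Hypothetical three-state chain: A -> B -> C, where C is terminal.
policy = {"A": "go", "B": "go"}                # state -> action to take
model = {("A", "go"): "B", ("B", "go"): "C"}   # (state, action) -> next state
reward = {"A": 0.0, "B": 0.0, "C": 1.0}        # state -> reward received on arrival

def evaluate(policy, model, reward, gamma=0.9, sweeps=50):
    """Estimate long-run values of each state under the given policy."""
    values = {state: 0.0 for state in reward}  # terminal state keeps value 0
    for _ in range(sweeps):
        for state, action in policy.items():
            next_state = model[(state, action)]
            # Value = immediate reward plus discounted value of what follows.
            values[state] = reward[next_state] + gamma * values[next_state]
    return values

values = evaluate(policy, model, reward)
# B is one step from the reward, A is two steps away, so A is worth less.
```

The reward says what is good immediately, while the resulting values say what is good in the long run: state A earns no reward on its next step, yet it still has positive value because it leads toward C.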

The authors use dynamic programming, Monte Carlo simulation, and temporal-difference learning to solve the reinforcement learning problem, but they emphasize that none of these methods offers a free lunch. An entire chapter is devoted to each of these methods, however, giving the reader a good overview of the weaknesses and strengths of each of these approaches. The differences between them usually boil down to issues of performance rather than accuracy in the generated solutions. Temporal difference learning in fact is viewed in the book as a combination of Monte Carlo and dynamic programming techniques, and in the opinion of this reviewer, has resulted in some of the most impressive successes for applications based on reinforcement learning. One of these is TD-Gammon, developed to play backgammon, and which is also discussed in the book.

The authors emphasize that these three main strategies for solving reinforcement learning problems are not mutually exclusive. Instead each of them could be used simultaneously with the others, and they devote a few chapters in the book to illustrating how this "unified" approach can be advantageous for reinforcement learning problems. They do this by using explicit algorithms and not just philosophical discussion. These discussions are very interesting and illustrate beautifully the idea that there is no "free lunch" in any of the different algorithms involved in reinforcement learning.
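
The sense in which temporal-difference learning combines the two other approaches can be sketched in a few lines. The example below is an illustrative TD(0) value estimator on a small random walk: it learns from sampled experience, as Monte Carlo methods do, but updates each estimate from the current estimate of the next state, as dynamic programming does. The function name, step size, episode count, and the undiscounted setting are assumptions made for this sketch.

```python
import random

def td0_random_walk(episodes=5000, alpha=0.02, n_states=5, seed=0):
    """Estimate state values of a symmetric random walk with tabular TD(0)."""
    rng = random.Random(seed)
    # States 1..n_states are non-terminal; 0 and n_states + 1 are terminal (value 0).
    values = [0.0] + [0.5] * n_states + [0.0]
    for _ in range(episodes):
        state = (n_states + 1) // 2  # every episode starts in the middle
        while 0 < state < n_states + 1:
            next_state = state + rng.choice((-1, 1))  # sampled experience
            # Reward 1 only when the walk exits on the right-hand side.
            reward = 1.0 if next_state == n_states + 1 else 0.0
            # Bootstrapped target: reward plus current estimate of the next state.
            td_error = reward + values[next_state] - values[state]
            values[state] += alpha * td_error
            state = next_state
    return values[1:-1]

estimates = td0_random_walk()
# For this walk the true values are 1/6, 2/6, ..., 5/6 from left to right.
```

A Monte Carlo version would instead wait until the episode ends and move each visited state's estimate toward the observed final outcome.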

In the last chapter of the book the authors overview some of the more successful applications of reinforcement learning, one of them already mentioned. Another one discussed is the `acrobot', which is a two-link, underactuated robot, which models to some extent the motion of a gymnast on a high bar. The motion of the acrobot is to be controlled by swinging its tip above the first joint, with appropriate rewards given until this goal is reached. The authors use the `Sarsa' learning algorithm, developed earlier in the book, for solving this reinforcement learning problem. The acrobot is an example of the current intense interest in machine learning of physical motion and intelligent control theory.
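
For readers unfamiliar with Sarsa, the core of the method is a single update rule. The sketch below shows one tabular update step only; it is not the acrobot setup from the book, and the state names, reward, step size, and discount are hypothetical values picked for illustration.

```python
from collections import defaultdict

def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=1.0):
    """Apply one tabular Sarsa update and return the new action value."""
    # The target uses the action the agent will actually take next (on-policy).
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

q = defaultdict(float)        # unseen (state, action) pairs default to 0.0
q[("s1", "right")] = 2.0      # made-up existing estimate
new_value = sarsa_update(q, "s0", "right", reward=-1.0,
                         next_state="s1", next_action="right")
# new_value is 0.0 + 0.1 * (-1.0 + 2.0 - 0.0), i.e. about 0.1
```

The name comes from the five quantities the update consumes: state, action, reward, next state, next action.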

Another example discussed in this chapter deals with the problem of elevator dispatching, which the authors include as an example of a problem that cannot be dealt with efficiently by dynamic programming. This problem is studied with Q-learning and via the use of a neural network trained by back propagation.
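
The Q-learning rule itself is compact. The sketch below is a single tabular update, without the neural network used in the elevator study, and every name and number in it is a hypothetical choice for illustration. Unlike an on-policy method such as Sarsa, the target uses the best available action value in the next state, whatever action the agent actually takes.

```python
from collections import defaultdict

def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new action value."""
    # Off-policy target: assume the best known action is taken from next_state.
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

actions = ("up", "down")
q = defaultdict(float)        # unseen (state, action) pairs default to 0.0
q[("s1", "up")] = 1.0         # made-up existing estimates
q[("s1", "down")] = 3.0
new_value = q_learning_update(q, "s0", "up", reward=0.5,
                              next_state="s1", actions=actions)
# The target is 0.5 + 0.9 * 3.0 = 3.2, so new_value is about 0.32
```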

The authors also treat a problem of great importance in the cellular phone industry, namely that of dynamic channel allocation. This problem is formulated as a semi-Markov decision problem, and reinforcement learning techniques were used to minimize the probability of blocking a call. Reinforcement learning has become very important in the communications industry of late, as well as in queuing networks.



5 out of 5 stars From the author of Approximate Dynamic Programming   December 15, 2007
Warren B. Powell (Princeton, NJ USA)
1 out of 1 found this review helpful

Reinforcement Learning is an exceptionally clear introduction to a field that also goes under names such as approximate dynamic programming, adaptive dynamic programming and neuro-dynamic programming. The book is written entirely from the perspective of computer science, where problems tend to have discrete states (although potentially large state spaces) and (typically) small action spaces.

The book provides numerous step-by-step algorithms that make it relatively easy to get started writing algorithms. The presentation uses minimal mathematics, and avoids the difficult theory supporting the convergence proofs, making it a nice introduction for undergraduates and graduates alike. But throughout the presentation is evidence of extensive experience with applying these methods to a range of classical problems in artificial intelligence.

Students interested in a stronger theoretical foundation should look at Neuro-Dynamic Programming (Optimization and Neural Computation Series, 3). My recent book, Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley Series in Probability and Statistics), puts far more emphasis on mathematical modeling, and presents the field more from the perspective of the operations research community. For an edited volume with a number of contributions from both artificial intelligence and control theory, see Handbook of Learning and Approximate Dynamic Programming (IEEE Press Series on Computational Intelligence).


Warren Powell
Professor
Operations Research and Financial Engineering
Princeton University


 
The Library of Math - Online Math Organized by Subject Into Topics. © 2005 - 2008 www.LibraryOfMath.com All rights reserved.