The Elements of Statistical Learning
Authors: T. Hastie, R. Tibshirani, J. H. Friedman Publisher: Springer Category: Book
List Price: $94.00 Buy New: $54.95 You Save: $39.05 (42%)
New (36) Used (16) from $54.95
Rating: 27 reviews Sales Rank: 38596
Media: Hardcover Edition: Corrected Pages: 552 Number Of Items: 1 Shipping Weight (lbs): 2.3 Dimensions (in): 9.4 x 6.1 x 1.2
ISBN: 0387952845 Dewey Decimal Number: 006.31 EAN: 9780387952840
Publication Date: July 30, 2003 Availability: Usually ships in 1-2 business days Shipping: Expedited shipping available Condition: Ships next business day from NY
Editorial Reviews:
Product Description
During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees, and boosting (the first comprehensive treatment of this topic in any book).
This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates.
Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.
Customer Reviews:
data mining through the eyes of statisticians October 1, 2001 Michael R. Chernick (Malvern, PA) 140 out of 152 found this review helpful
Data mining is a field developed by computer scientists, but many of its crucial elements are embedded in important and subtle statistical concepts. Statisticians can play an important role in the development of this field but, as was the case with artificial intelligence, expert systems and neural networks, the statistical research community has been slow to respond. Hastie, Tibshirani and Friedman are changing this. Friedman has been a major player in pattern recognition of high dimensional data, in tree classification, regularized discriminant analysis and multivariate adaptive regression splines. He has also done some exciting new research on boosting methods. Hastie and Tibshirani invented additive models, which are very general types of regression models. Tibshirani invented the lasso method and is a leader among the researchers on the bootstrap. Hastie invented principal curves and surfaces. These tools and the expertise of these authors make them naturals to contribute to advances in data mining. They come with great expertise and see data mining from the statistical perspective. They see it as part of a more general process of statistical learning from data. The book is well written and illustrated with many pretty color graphs and figures. Color adds a dimension in pattern recognition and the authors exploit it in this book. It is really the first of its kind that treats data mining from a statistical perspective and is so comprehensive and up-to-date. The important statistical tools covered in this book include, under the category of supervised learning: regression, discriminant analysis, kernel methods, model assessment and selection, bootstrapping, maximum likelihood and Bayesian inference, additive models, classification and regression trees, multivariate adaptive regression splines, boosting, regularization methods, nearest neighbor classification, k-means clustering algorithms and neural networks. These methods are illustrated using real problems.
Similarly, under the category of unsupervised learning, clustering and association are covered. They cover the latest developments in principal components and principal curves, multidimensional scaling, factor analysis and projection pursuit. This book is innovative and fresh. It is an important contribution that will become a classic. The level is between intermediate and advanced. Good for an advanced special topics course for graduate students in statistics. The only comparable text is the one by Mannila, Hand and Smyth, which I hope to be able to review in the near future.
Useful book on data mining February 6, 2002 frank lindemann 80 out of 86 found this review helpful
I use data mining tools in my financial engineering and financial modeling work and I have found this book to be very useful. This book provides two crucial types of information. First, it provides enough theory to allow a potential user to understand the essential insights that motivate specific techniques and to evaluate the situations in which those techniques are appropriate. Second, the book gives the exact algorithms to implement the various techniques. While no book I have seen covers every data mining methodology available, this one has the strongest coverage I have seen in additive models, non-linear regression, and CART/MART (regression/classification trees). It also has very strong coverage in many other areas. I highly recommend it.
data mining from the viewpoint of statisticians January 24, 2008 Michael R. Chernick (Holland PA) 22 out of 22 found this review helpful
Data mining is a field developed by computer scientists, but many of its crucial elements are embedded in important and subtle statistical concepts. Statisticians can play an important role in the development of this field but, as was the case with artificial intelligence, expert systems and neural networks, the statistical research community has been slow to respond. Hastie, Tibshirani and Friedman are changing this. Friedman has been a major player in pattern recognition of high dimensional data, in tree classification, regularized discriminant analysis and multivariate adaptive regression splines. He has also done some exciting new research on boosting methods. Hastie and Tibshirani invented additive models, which are very general types of regression models. Tibshirani invented the lasso method and is a leader among the researchers on the bootstrap. Hastie invented principal curves and surfaces. These tools and the expertise of these authors make them naturals to contribute to advances in data mining. They come with great expertise and see data mining from the statistical perspective. They see it as part of a more general process of statistical learning from data. The book is well written and illustrated with many pretty color graphs and figures. Color adds a dimension in pattern recognition and the authors exploit it in this book. It is really the first of its kind that treats data mining from a statistical perspective and is so comprehensive and up-to-date. The important statistical tools covered in this book include, under the category of supervised learning: regression, discriminant analysis, kernel methods, model assessment and selection, bootstrapping, maximum likelihood and Bayesian inference, additive models, classification and regression trees, multivariate adaptive regression splines, boosting, regularization methods, nearest neighbor classification, k-means clustering algorithms and neural networks. These methods are illustrated using real problems.
Similarly, under the category of unsupervised learning, clustering and association are covered. They cover the latest developments in principal components and principal curves, multidimensional scaling, factor analysis and projection pursuit. This book is innovative and fresh. It is an important contribution that will become a classic. The level is between intermediate and advanced. Good for an advanced special topics course for graduate students in statistics. A comparable text is the one by Mannila, Hand and Smyth. This book made effective use of color and maintained a competitive price. This had a major impact on publishers like Wiley that could not sell a book of this size at this initial price. Wiley is still looking for a book comparable to this one that they can use to compete with Springer-Verlag. I know this information because I heard it from the Wiley acquisitions editor that I worked with on my two books.
Counter to review from Sep 8 September 11, 2003 Dr. Thomas Lengauer (Germany) 19 out of 21 found this review helpful
The review from September 8 expresses an opinion which is the exact opposite of mine, and is worded so strongly that I have to object. I gave a course using the book to bioinformaticians, most of them with a computer science background, and found the book exceptionally well prepared and suitable for a graduate course. The book serves the dual purpose of an introduction and a reference. An especially nice feature is how the authors explain the relationships and differences between different methods. By doing so, they provide context which I have not seen in any other book on this subject. The book is a very nice combination of basic theory and performance evaluation on data from a wide variety of domains, and it is quite up-to-date. It has a well-developed accompanying website, and the graphical material can be obtained electronically from the publisher. The book is an outstanding contribution to the field.
Must read for practicing statisticians. September 21, 2005 Robert Long (Richmond, VA USA) 8 out of 10 found this review helpful
These guys have made a great contribution to the statistical literature. It is a broad book that intends to summarize the latest methods available for data analysis. The authors succeed in giving a statistical context with which to compare and contrast many statistical methods. Some of the statistical methods discussed were developed in the past 5-15 years (SVM, boosting, LASSO, etc.) and haven't yet been put into a broader context. While this book is not comprehensive in its treatment, it is the best single book on data analysis available.