Regression Modeling Strategies | 
Author: Frank E. Harrell Jr. Publisher: Springer Category: Book
List Price: $109.00 Buy New: $75.00 You Save: $34.00 (31%)
New (20) Used (13) from $72.00
Rating: 8 reviews Sales Rank: 466903
Media: Hardcover Edition: Corrected Pages: 600 Number Of Items: 1 Shipping Weight (lbs): 2.5 Dimensions (in): 9.4 x 7.1 x 1.3
ISBN: 0387952322 Dewey Decimal Number: 519.536 EAN: 9780387952321
Publication Date: January 10, 2001 Availability: Usually ships in 1-2 business days
| Editorial Reviews:
Product Description There are many books that are excellent sources of knowledge about individual statistical tools (survival models, general linear models, etc.), but the art of data analysis is about choosing and using multiple tools. In the words of Chatfield, "...students typically know the technical details of regression for example, but not necessarily when and how to apply it. This argues the need for a better balance in the literature and in statistical teaching between techniques and problem solving strategies." Whether analyzing risk factors, adjusting for biases in observational studies, or developing predictive models, there are common problems that few regression texts address. For example, there are missing data in the majority of datasets one is likely to encounter (other than those used in textbooks!) but most regression texts do not include methods for dealing with such data effectively, and texts on missing data do not cover regression modeling.
|
| Customer Reviews:
advanced topics in regression with emphasis on model selection January 24, 2008 Michael R. Chernick (Holland PA) 26 out of 26 found this review helpful
Frank Harrell is a professor who does a lot of consulting in medical research. This book covers a wide variety of topics in regression analysis, including many advanced techniques such as data reduction, smoothing techniques, variable selection, transformations, shrinkage methods, tree-based methods and resampling. But note the title "Regression Modeling Strategies". Unlike most advanced texts in regression, this book emphasizes modeling strategies. So the focus is on things like variable selection and other techniques to avoid overfitting models, and on diagnostics to look for violations of assumptions such as variance homogeneity or normality and independence of residuals, or stability problems like collinearity. The book covers an extensive collection of modern techniques for exploratory data analysis. Inferential methods are also considered and he deals appropriately with important issues (particularly for medical research) such as imputation of missing values. Many examples are considered and illustrated in S-PLUS. Harrell also provides many rules of thumb based on his own experience building models. A lot of the techniques are illustrated using data from the Titanic, where it is interesting to see which factors affected the probability of survival. My only disappointment was that there is perhaps too much emphasis on this one particular data set. A standard regression text would be expected to include linear and nonlinear regression. Harrell goes much deeper, including nonparametric regression, logistic regression and survival models (e.g. the Cox proportional hazards model).
Published Review by Margaret May May 26, 2004 bibbub (Los Angeles, CA) 9 out of 9 found this review helpful
Though it can be treated as an advanced book in statistics, empirical researchers can find tremendous value in this book just by following the steps and visualizing their data. It is very useful for fitting and validating prognostic models, and it emphasizes the use of the bootstrap. Just flipping through this book, you will feel silly pre-specifying a multivariate regression model (using Proc reg, logistic, phreg) without checking for interactions and nonlinear terms, or using a simple model-fitting approach such as stepwise selection, because data in the real world are by no means "linear" and free of interaction. All the functions are written in S-Plus (or R) and I cannot resist the temptation of those beautiful and highly informative graphs. As a result, I have converted to being an S-Plus user after being a SAS user for years, following in the steps of Dr. Harrell (a SAS user from 1969-1991).
This review was published in the International Journal of Epidemiology 2002;31:699-700 (I am quoting it, not the author). Most statistical textbooks present techniques and give simple examples of their use. This book is different. It assumes you already have the basic tools of linear and logistic regression, parametric and semi-parametric survival analysis in your well-stocked statistical tool box which you acquired in graduate school. The question this book addresses is how do you use those regression tools properly. The book succeeds in being both philosophical and intensely practical in nature. It is about the art of data analysis and modelling strategies. It takes you through the whole process starting with imputation of missing data, leading you through dealing with non-linear relationships, estimating transformations, variable selection, model building and finally validation of the model using powerful bootstrap techniques.
Harrell has a unifying approach to regression modelling strategies in that he emphasises how the methods he presents may be used across many different types of regression model in a variety of subject areas, although his examples are biomedical. One of the main points of the book is that there is a dishonesty that is widespread in that we treat inference from P-values, confidence intervals and statistics as if the data were not used to build the model. We need to recognise that it is usually not possible to pre-specify a multivariable regression model, for example, whether a survival model should be a Weibull or a lognormal model, what transformations of variables are appropriate, inclusion of non-linear terms and interaction terms and so on. However, statistics are often computed as if the data were not used to make decisions about the form of the model and how predictors are represented in the model. This means that models overfit the data on which they are estimated and poorly predict responses of future observations. Great emphasis is placed on addressing this fundamental problem of the modelling process. In particular, the author strongly recommends using bootstrap methods in many steps of the modelling strategy, including variable selection, derivation of distribution-free confidence intervals and estimation of optimism in model fit. For example, there has been much criticism of stepwise variable selection, but Harrell uses this procedure with bootstrapping and shows that variation in bootstrapped samples of the same dataset will lead to selection of different sets of variables and that a better strategy is to use the set of variables which occurs most frequently in the bootstrapped samples. This will give a more reliable and useful set of prognostic factors in the model which will predict responses from new data with greater precision and accuracy.
There are detailed case studies of real examples which are analysed using S-Plus with the code being explicitly given. The web site of the book gives access to the datasets and an S-Plus library with 200 functions for model fitting and testing, estimation, validation, prediction, graphics and typesetting. The book is particularly strong on graphical presentation of the regression models and claims that a picture will often persuade a non-statistician of the necessity for a particular transformation of a predictor rather than to opt for a simple linear term which does not fit the data so well. In particular, cubic splines and non-parametric smoothers are recommended early on as a way of relaxing linear assumptions and are used throughout the case studies. This is an excellent book for its target audience, postgraduates who know the technical details of regression models, but not necessarily when and how to use them. It is also a worthwhile addition to the reference shelf of data analysts and statistical methodologists who will appreciate the many recipes given for successful modelling strategies and tips on validation when the data have been used to inform the modelling process.
Outstanding graduate text April 28, 2003 8 out of 8 found this review helpful
This text does a five-star job of what the title advertises. The book could be used for a one-year graduate course in applied linear models. The writing is excellent, and the topics are very up to date. This is for graduate students with a good foundation in mathematical statistics and applied statistics. Very good integration with modern statistical packages.
A great book for anyone who wants to do regression May 31, 2005 Peter Flom (New York City) 6 out of 6 found this review helpful
This is a great book. Although it is not as easy to understand as some other books on regression, I feel that anyone who doesn't understand the ideas put forth here is not really fully competent to do regression analysis. The book is not at fault - statistics is not a simple subject, and regression is not a simple subject, either. The audience for this book is NOT theoretical statisticians, it is applied statisticians with some background in regression. As a social scientist/statistician, my only complaint is that nearly all the examples are medical - but that's a minor point.