This open access book presents robust statistical methods and procedures through the monitoring approach, with an emphasis on applications to linear regression. Illustrating the theory, it explores both large and small-sample properties. The performance of the forward search and of the monitoring of static robust estimators for regression data is illuminated through numerous data analyses using MATLAB and R.
The book describes the results of the authors’ many years of work developing powerful methods of robust regression analysis. Robust methods are designed to analyse contaminated data. The well-established static robust methods estimate model features, such as parameters, assuming that the amount of contamination in the data is known. These methods are described in detail in Chapter 2 for estimation in a simple sample. The extension to regression is presented in Chapter 3, with an emphasis on S-estimation and related procedures as well as on least trimmed squares. The monitoring methods of Chapter 4, including the forward search, find the appropriate level of robustness for each data set and so avoid both biased estimation from the inclusion of outliers and inefficiency due to the deletion of uncontaminated observations. This analysis is followed by examples which illustrate the use of the interactive graphical analyses associated with the authors’ FSDA toolbox. Numerical comparisons of the size and power of outlier tests appear in Chapter 5. Later chapters illustrate applications to response transformation in regression and to non-parametric regression. Extensions of the robust multiple regression model include Bayesian, heteroskedastic, time series and compositional regression, together with the clustering of regression models. Finally, several approaches to model selection are investigated and robust analyses of regression data are presented that illustrate the use of the techniques introduced earlier.
Exercises are given at the end of each chapter, with solutions at the end of the book. The MATLAB code can be run in MATLAB Online, without the need for a license, or in the language-agnostic Jupyter notebook environment after installing the MATLAB kernel. Online computer code is available for all examples and exercises, together with a series of YouTube videos.
Aimed at professional statisticians and researchers concerned with insightful data analysis, as well as postgraduate students, the book may also serve as a text for a modern interactive robust regression course.