### 12.4 Identifying Outliers and High-Leverage Observations

The Cook's D plot attempts to identify observations that either have high leverage or are outliers in the response variable. However, a large Cook's D value does not by itself reveal whether the observation has high leverage, is an outlier, or both; that requires further investigation. A useful complement to the Cook's D plot is therefore a plot of the studentized residuals versus the leverage statistic for each observation.
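To see why Cook's D alone is ambiguous, it can help to compute the three statistics by hand for a small example. The following is a minimal sketch in Python, not the SAS/IML program used in this chapter. It assumes a simple one-regressor model, internally studentized residuals, and made-up data; the function name `diagnostics` and the data values are hypothetical and chosen only for illustration.

```python
from math import sqrt

def diagnostics(x, y):
    """Return (leverage, studentized residuals, Cook's D) for the fit y = b0 + b1*x."""
    n, p = len(x), 2                      # p counts the intercept and the slope
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)

    # Ordinary least squares estimates for simple linear regression
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)   # residual mean square

    # Leverage: diagonal of the hat matrix for a one-regressor model
    lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]
    # Internally studentized residuals
    stud = [e / sqrt(s2 * (1 - h)) for e, h in zip(resid, lev)]
    # Cook's D combines the squared studentized residual with the leverage
    cook = [r * r / p * h / (1 - h) for r, h in zip(stud, lev)]
    return lev, stud, cook

# Hypothetical data: points on the line y = 1 + 2x, then two perturbations
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [1 + 2 * xi for xi in x]
y[4] += 10   # unusual response at an ordinary x value
y[9] -= 6    # modest shift at an extreme x value

lev, stud, cook = diagnostics(x, y)
for i in range(len(x)):
    print(f"{i:2d}  leverage={lev[i]:.3f}  student={stud[i]:6.2f}  cookd={cook[i]:.3f}")
```

In this made-up example, the two perturbed observations have the two largest Cook's D values, but for different reasons: one has a large studentized residual and unremarkable leverage, whereas the other has a moderate residual and very high leverage. Because Cook's D is the product of a residual term and a leverage term, the single number cannot separate the two cases, whereas a plot of one statistic against the other can.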

These residual and leverage statistics were created in the output data set by the GLM procedure in Section 12.2. The program statements in this section are a continuation of the analysis in the preceding sections. For the residual-leverage plot, Belsley, Kuh, and Welsch (1980) suggest adding horizontal reference lines at ±2 to identify observations with large studentized residuals, and a vertical reference line at 2p/n to identify points with high leverage, where p is the total number of parameters in the model, including the intercept, and n is the number of observations. The following statements create the plot and use the abline module defined in Section 9.4 to add reference lines: