

12.4 Identifying Outliers and High-Leverage Observations

The Cook's D plot attempts to identify observations that either have high leverage or are outliers in the response variable. However, if an observation has a large Cook's D value, it is not clear (without further investigation) whether that point has high leverage, is an outlier, or both. A plot that complements the Cook's D plot is a plot of the studentized residuals versus the leverage statistic for each observation.
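A standard identity helps show why a large Cook's D value is ambiguous: it factors the statistic into a residual term and a leverage term. The notation is assumed here for illustration and may differ from the book's: h_ii is the leverage of observation i, r_i is its internally studentized residual, and p is the number of model parameters, including the intercept.

```latex
D_i \;=\; \frac{r_i^{2}}{p}\cdot\frac{h_{ii}}{1-h_{ii}},
\qquad
r_i \;=\; \frac{e_i}{s\sqrt{1-h_{ii}}}
```

Here e_i is the ordinary residual and s is the root mean square error. Either a large |r_i| or an h_ii close to 1 can inflate D_i, which is why plotting the two quantities against each other separates the two cases.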

These residual and leverage statistics were created in the output data set by the GLM procedure in Section 12.2. The program statements in this section are a continuation of the analysis in the preceding sections. For the residual-leverage plot, Belsley, Kuh, and Welsch (1980) suggest adding horizontal reference lines at ±2 to identify observations with large studentized residuals, and a vertical reference line at 2p/n to identify points with high leverage, where p is the total number of parameters in the model, including the intercept, and n is the number of observations. The following statements create the plot and use the abline module defined in Section 9.4 to add reference lines:
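The book's SAS/IML statements are not part of this excerpt. Separately, the two cutoffs described above can be illustrated with a minimal Python sketch. It is not the book's program: it assumes a simple linear regression (so p = 2), uses internally studentized residuals, and runs on made-up data. The function name `residual_leverage_flags` and the variables `x` and `y` are hypothetical. Plotting is omitted; the sketch only computes the statistics and applies the ±2 and 2p/n rules.

```python
import math

def residual_leverage_flags(x, y, resid_cutoff=2.0):
    """Fit y = b0 + b1*x by least squares and flag observations.

    Returns (leverage, student, lev_cutoff, outliers, high_leverage), where
    the last two are lists of 0-based observation indices.
    """
    n = len(x)
    p = 2  # parameters in the model: intercept and slope
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)

    # Least-squares estimates for simple linear regression
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar

    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e * e for e in resid) / (n - p))  # root MSE

    # Leverage: diagonal of the hat matrix, in closed form for one regressor
    leverage = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]

    # Internally studentized residuals
    student = [e / (s * math.sqrt(1.0 - h)) for e, h in zip(resid, leverage)]

    lev_cutoff = 2.0 * p / n  # vertical reference line at 2p/n
    outliers = [i for i, r in enumerate(student) if abs(r) > resid_cutoff]
    high_leverage = [i for i, h in enumerate(leverage) if h > lev_cutoff]
    return leverage, student, lev_cutoff, outliers, high_leverage

# Made-up data: observation 5 has an unusual response,
# observation 9 has an unusual x value.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 18.0, 13.9, 16.1, 18.0, 40.2]

leverage, student, lev_cutoff, outliers, high_leverage = residual_leverage_flags(x, y)
print("leverage cutoff 2p/n:", lev_cutoff)
print("large studentized residuals at:", outliers)
print("high leverage at:", high_leverage)
```

In this toy data set the two rules pick out different observations, which is the distinction a Cook's D value alone cannot make. For a model with more regressors, the leverage values would instead come from the diagonal of the hat matrix.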


  
