1.1 Problems with Assuming Normality

Introduction to Robust Estimation and Hypothesis Testing

known, if sampling is from the contaminated normal, the length of the standard confidence interval for the population mean, µ, will be over three times longer than it would be when sampling from the standard normal distribution instead. What is important from a practical point of view is that there are location estimators other than the sample mean whose standard errors are substantially less affected by heavy-tailed distributions. By "measure of location" is meant some measure intended to represent the typical participant or object, the two best-known examples being the mean and the median. (A more formal definition is given in Chapter 2.) Some of these measures have relatively short confidence intervals when distributions have a heavy tail, yet the length of the confidence interval remains reasonably short when sampling from a normal distribution instead. Put another way, there are methods for testing hypotheses that have good power under normality and that continue to have good power when distributions are nonnormal, in contrast to methods based on means. For example, when sampling from the contaminated normal given by Eq. (1.2), both Welch's and Student's methods for comparing the means of two independent groups have power approximately 0.278 when testing at the 0.05 level with equal sample sizes of 25 and when the difference between the means is 1. In contrast, several other methods, described in Chapter 5, have power exceeding 0.7. In an attempt to salvage the sample mean, it might be argued that in some sense the
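
The interval-length and power figures quoted in this passage can be checked approximately by simulation. The following is a minimal sketch in Python, not the book's own code. It assumes that Eq. (1.2), which is not reproduced on this page, is the contaminated normal that samples from N(0, 1) with probability 0.9 and from N(0, 10^2) with probability 0.1. The names EPS, K, T_CRIT_48, and estimate_power are hypothetical and chosen for illustration. Because the sample sizes are equal, Student's and Welch's test statistics coincide (only their degrees of freedom differ), so the sketch simulates Student's test alone, with the two-sided 0.05 critical value for 48 degrees of freedom hard-coded.

```python
import math
import random

# ASSUMPTION: Eq. (1.2) is the mixture 0.9*N(0, 1) + 0.1*N(0, 10^2).
EPS = 0.1   # assumed probability of drawing from the wide component
K = 10.0    # assumed standard deviation of the wide component

# Two-sided 0.05 critical value of Student's t with 25 + 25 - 2 = 48 df.
T_CRIT_48 = 2.0106


def standard_normal(rng, shift=0.0):
    """Draw one value from N(shift, 1)."""
    return shift + rng.gauss(0.0, 1.0)


def contaminated_normal(rng, shift=0.0):
    """Draw one value from the assumed contaminated normal, moved by shift."""
    sd = K if rng.random() < EPS else 1.0
    return shift + rng.gauss(0.0, sd)


def sample_mean_var(values):
    """Return the sample mean and the unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var


def student_t(x, y):
    """Pooled-variance two-sample t statistic."""
    n1, n2 = len(x), len(y)
    m1, v1 = sample_mean_var(x)
    m2, v2 = sample_mean_var(y)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))


def estimate_power(sampler, n=25, shift=1.0, reps=4000, seed=1):
    """Fraction of simulated experiments in which Student's test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [sampler(rng, shift) for _ in range(n)]
        y = [sampler(rng, 0.0) for _ in range(n)]
        if abs(student_t(x, y)) > T_CRIT_48:
            rejections += 1
    return rejections / reps


# Standard deviation of the assumed mixture: sqrt(0.9 * 1 + 0.1 * 100).
sd_cn = math.sqrt((1 - EPS) * 1.0 + EPS * K ** 2)

power_normal = estimate_power(standard_normal)
power_cn = estimate_power(contaminated_normal)

print(f"sd of assumed contaminated normal: {sd_cn:.3f}")
print(f"estimated power, standard normal:     {power_normal:.3f}")
print(f"estimated power, contaminated normal: {power_cn:.3f}")
```

Under the stated assumption the mixture has variance 0.9(1) + 0.1(100) = 10.9, so its standard deviation is about 3.3. With a known standard deviation, the confidence interval for µ scales with that value, which is consistent with the "over three times longer" remark. The simulated power values depend on the seed and the number of replications, so they should be read as rough estimates rather than exact reproductions of the quoted 0.278.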