CHAPTER 5 Is There a Statistical Difference between Designs?

Fortunately, even for small sample sizes (fewer than 30), the t-test generates reliable results when the data are not normally distributed. Box (1953) showed that a typical amount of error is a manageable 2 percentage points: for example, if you generate a p-value of 0.02, the long-term actual probability might be 0.04. The t-test is most robust to non-normality when the sample sizes in both groups are equal, so, if possible, you should plan for equal sample sizes in each group, even though you might end up with uneven sample sizes.

Equality of Variances

The third assumption is that the variances (and equivalently the standard deviations) are approximately equal in both groups. As a general rule, you should only be concerned about unequal variances when the ratio between the two standard deviations is greater than 2 (e.g., a standard deviation of 4 in one sample and 12 in the other is a ratio of 3) (Agresti and Franklin, 2007). The robustness of the two-sample t-test also extends to violations of this assumption, especially when the sample sizes are roughly equal (Agresti and Franklin, 2007; Box, 1953; Howell, 2002). For a method of adjusting degrees of freedom to help compensate for unequal variances, see the sidebar "Degrees of Freedom for the Two-sample t-test."

Don't Worry Too Much about Violating Assumptions (Except Representativeness)

Now that we've covered the assumptions for the two-sample t-test, we want to reassure you that