CHAPTER 5 Is There a Statistical Difference between Designs?

The mean difference score is -77 seconds and the standard deviation of the difference scores is 61 seconds. Plugging these values into the formula, we get

$$t = \frac{\hat{D}}{s_D / \sqrt{n}} = \frac{-77}{61 / \sqrt{21}} = -5.78$$

We have a test statistic (t) equal to -5.78 with 20 (n - 1) degrees of freedom, and we decided before running the study to conduct a two-sided test. To determine whether this is significant, we need to look up the p-value using a t-table, the Excel function =TDIST(), or the calculator available at http://www.usablestats.com/calcs/tdist. Using =TDIST(5.78,20,2), we find p = 0.00001, so there is strong evidence to conclude that users take less time to complete an expense report on Product A. If you follow the steps from the previous example, you'll find that the 95% confidence interval for this difference ranges from about 49 to 104 seconds, a difference that users are likely to notice.

In this example, the test statistic is negative because we subtracted the typically longer task time (from Product B) from the shorter task time (Product A). We would get the same p-value if we subtracted the smaller time from the larger time, changing the sign of the test statistic. When using the Excel TDIST function, keep in mind that it only works with positive values of t.

Normality Assumption of the Paired t-test

As we've seen with the paired t-test formula, the computations are performed on the difference scores. We are therefore working with only one sample of data, which means the paired t-test is really just the one-sample t-test from Chapter 4 under a different name. The paired t-test therefore has the same normality assumption as the one-sample t-test. For large sample sizes (above 30), normality isn't a concern because the sampling distribution of the mean is approximately normally distributed (see Chapter 9).

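
The paired t-test arithmetic worked through earlier in this section (the test statistic, the two-sided p-value, and the 95% confidence interval) can be checked with a short Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the p-value is obtained by numerically integrating the t density rather than by calling Excel's TDIST, and the critical value 2.086 for 20 degrees of freedom is taken from a standard t-table rather than from the text.

```python
import math

def paired_t_from_summary(mean_diff, sd_diff, n):
    """Return (t, df) for a paired t-test given summary statistics
    of the difference scores."""
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1

def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return const * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=20000):
    """Two-sided p-value: 1 minus twice the area under the density on [0, |t|],
    with the area approximated by Simpson's rule (steps must be even)."""
    upper = abs(t)  # like TDIST, work with the positive value of t
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)

# Summary values from the example: mean difference, SD of differences, sample size.
mean_diff, sd_diff, n = -77, 61, 21

t_stat, df = paired_t_from_summary(mean_diff, sd_diff, n)
p_value = two_sided_p(t_stat, df)

# ASSUMPTION: 2.086 is the two-sided 95% critical value for 20 df from a t-table.
t_crit = 2.086
margin = t_crit * sd_diff / math.sqrt(n)
ci_low = abs(mean_diff) - margin
ci_high = abs(mean_diff) + margin

print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.6f}")
print(f"95% CI for the difference: {ci_low:.1f} to {ci_high:.1f} seconds")
```

With the summary values from the example, the sketch gives a t of about -5.78 on 20 degrees of freedom and an interval of roughly 49.2 to 104.8 seconds. Flipping the sign of the mean difference changes the sign of t but leaves the p-value unchanged, because the calculation uses the absolute value of t.
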
For smaller sample sizes (less than 30) and for two-tailed tests, the one-sample t-test/paired t-test is considered robust against violations of the normality assumption. That is, data can be non-normal (as with task-time data) but still generate accurate p-values (Box, 1953). For this reason, we recommend sticking with the two-sided test when using the paired t-test.

Between-subjects Comparison (Two-sample t-test)

When a different set of users is tested on each product, there is variation both between users and between designs. Any difference between the means (e.g., questionnaire data, task times) must be tested to see whether it is greater than the variation between the different users.