

CHAPTER 17: Metric Predicted Variable with Multiple Metric Predictors

the value to be predicted is on a metric scale, and there is more than one predictor, each of which is also on a metric scale. We will consider models in which the predicted variable is an additive combination of predictors, all of which have proportional influence on the prediction. This kind of model is called multiple linear regression, and it is listed in Table 14.1 (p. 385) in the first row and third column. We will also consider nonadditive combinations of predictors, which are called interactions.

17.1 MULTIPLE LINEAR REGRESSION

Figures 17.1 and 17.2 show examples of data generated by a model for multiple linear regression. The model specifies the dependency of $y$ on $x_1, x_2$, but it does not specify the distribution of $x_1, x_2$. At any position, $x_1, x_2$, the values of $y$ are normally distributed in a vertical direction, centered on the height of the plane at that position. The height of the plane is a linear combination of the $x_1, x_2$ values. Formally, $y \sim N(\mu, \sigma)$, and $\mu = \beta_0 + \beta_1 x_1 + \beta_2 x_2$. For a review of how to interpret the coefficients, $\beta_0$, $\beta_1$, and $\beta_2$, see Figure 14.2, p. 365. The model assumes homogeneity of variance: At all values of $x_1, x_2$, the variance of $y$ is the same.

17.1.1 The Perils of Correlated Predictors

Figures 17.1 and 17.2 show data generated from the same model. All that differs between them is the distribution of $x_1, x_2$, which is not specified by the model. In Figure 17.1, the $x_1, x_2$ values are distributed uniformly. In Figure 17.2, the $x_1, x_2$ values are negatively correlated: When $x_1$ is small, $x_2$ tends to be large, and when $x_1$ is large, $x_2$ tends to be small. The correlation of $x_1$ and $x_2$ can lead to misinterpretations of their individual influences on $y$. For instance, notice in Figure 17.2 that when $x_1$ is near zero, then the data $y$ values are near 30, but when $x_1$ is near 10, then the data $y$ values are near 20.
This observation that $y$ declines from 30 to 20 might leave the impression that an increase in $x_1$ predicts a decrease in $y$. But such an impression is wrong, because the data were generated by a function that increases $y$ as $x_1$ increases (i.e., the coefficient $\beta_1$ on $x_1$ is +1). The reason that the $y$ values appear to decline as $x_1$ increases is that $x_2$ decreases when $x_1$ increases, and $x_2$ has an even bigger influence on $y$ than $x_1$ does.

It is not unusual for predictors to be correlated in real data. For example, consider trying to predict a state's average high school SAT score on the basis of the amount of money the state spends per pupil. If you plot only mean SAT against money spent, there is actually a decreasing trend, as can be seen in the lower-left panel of Figure 17.3 (data from Guber, 1999). In other words, SAT scores tend to go down as spending goes up. Guber (1999) explains how some political commentators have used this sort of evidence to argue against funding public education.
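
The sign reversal described for Figure 17.2 can be reproduced with a small simulation. The sketch below is only an illustration under stated assumptions: the text gives the coefficient on x1 as +1, while the intercept (10), the coefficient on x2 (2), the standard deviation of y (2), and the rule that makes x2 negatively correlated with x1 are hypothetical values chosen so that y is near 30 when x1 is near 0 and near 20 when x1 is near 10. Ordinary least squares is used here only as a quick way to read off slopes; it is an assumption of this sketch, not a method taken from the text.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


def simulate(n=5000, beta0=10.0, beta1=1.0, beta2=2.0, sigma=2.0, seed=0):
    """Draw data from y ~ N(beta0 + beta1*x1 + beta2*x2, sigma).

    Only beta1 = +1 comes from the text; every other default is an
    assumption made for this illustration.
    """
    rng = random.Random(seed)
    x1 = [rng.uniform(0.0, 10.0) for _ in range(n)]
    # Assumed rule for negatively correlated predictors:
    # x2 is large when x1 is small, and small when x1 is large.
    x2 = [10.0 - a + rng.gauss(0.0, 1.0) for a in x1]
    y = [rng.gauss(beta0 + beta1 * a + beta2 * b, sigma)
         for a, b in zip(x1, x2)]
    return x1, x2, y


def simple_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    return cov(x, y) / cov(x, x)


def multiple_slopes(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 together (normal equations)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


x1, x2, y = simulate()
marginal = simple_slope(x1, y)
b1, b2 = multiple_slopes(x1, x2, y)
print(f"slope of y on x1 alone:      {marginal:+.2f}")
print(f"slopes with both predictors: b1={b1:+.2f}, b2={b2:+.2f}")
```

Under these assumptions, the slope of y on x1 alone should come out negative, while the slopes estimated with both predictors in the model should land close to the generating values of +1 and +2. The apparent decline of y with x1 is therefore a consequence of leaving x2 out, not a property of the generating function.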