
CHAPTER 16: Metric Predicted Variable with One Metric Predictor

16.1.1.2 Initializing the Chains

Figure 16.4 showed an example of how the believable values in a posterior distribution can occupy a fairly narrow region of parameter space. For the MCMC chain to randomly sample from the posterior, the random walk must first get into the modal region of the posterior. We might simply start the chain at any point in parameter space, randomly selected from the prior distribution, and wait through the burn-in period until the chain randomly wanders into the bulk of the posterior. Unfortunately, for many real-world situations, this burn-in period can be very long. Therefore, it helps to initialize the chains near the bulk of the posterior if we can.

If the data set is large and dominates the prior, or if the prior is diffuse, or if the prior is informed by previous results and is reasonably consistent with the new data, then the peak of the likelihood function will be reasonably near the peak of the posterior distribution. Therefore, a reasonable candidate for the initial point of the chain is the maximum-likelihood estimate of the parameters. This heuristic is only useful if we have a simple way of determining the maximum-likelihood estimate. Fortunately, for simple linear regression, we do. When the data are standardized, the maximum-likelihood estimate (MLE) of $\beta_0$ is zero, the MLE of $\beta_1$ is the correlation (denoted $r$) of $x$ and $y$, and the MLE of the precision, $\tau$, is $1/(1-r^2)$.

To get an intuition for the statement about precision, consider what happens when the correlation $r$ approaches 1 (its maximum possible value). As $r$ approaches 1, the $x,y$ data fall very close to a straight line, which implies that the deviation of the data away from the line is very small (recall Figure 16.1). Hence when $r$ is large, $\sigma$ is small, and hence $\tau$ is large, as reflected by the formula $\tau = 1/(1-r^2)$.

16.1.2 The Posterior: How Big Is the Slope?
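The initialization heuristic described in Section 16.1.1.2 can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the function name `mle_inits` and the height/weight numbers are invented for the example, and NumPy is assumed.

```python
import numpy as np

def mle_inits(x, y):
    """MLE initial values for standardized simple linear regression.

    Returns beta0 = 0, beta1 = r, and tau = 1/(1 - r^2), per the
    heuristic in Section 16.1.1.2. (Hypothetical helper, for
    illustration only.)
    """
    # Standardize the data (the MLE statements assume standardized x, y).
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    r = np.corrcoef(zx, zy)[0, 1]        # correlation of x and y
    return {
        "beta0": 0.0,                    # MLE of the intercept is zero
        "beta1": r,                      # MLE of the slope is r
        "tau": 1.0 / (1.0 - r**2),       # MLE of the precision
    }

# Illustrative heights (inches) and weights (pounds):
inits = mle_inits(np.array([60.0, 64.0, 68.0, 72.0, 76.0]),
                  np.array([105.0, 125.0, 140.0, 165.0, 180.0]))
```

Because these made-up data lie very close to a line, $r$ is near 1 and the initial precision is correspondingly large, matching the intuition given above.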
Figure 16.5 shows the posterior distribution of slope values. The standardized and original-scale slopes express the same relationship on different scales, and therefore the posterior distributions are identical except for a change of scale. The posterior distribution tells us exactly what we want to know: the believable slopes. We see that weight increases by about 5 or 6 pounds for every 1-inch increase in height. The 95% HDI provides a useful summary of the range of believable slopes.

If we were interested in determining whether the predictor has a nonzero influence on the predicted variable, we might use the decision rules discussed in Section 12.1.3 (p. 301). We may want to establish a ROPE around zero for the predictor and then check whether the entire 95% HDI excludes the ROPE. (Usually, if the true value is zero, the HDI will overlap the ROPE, thereby reducing false alarms.) It would be possible to "test" whether the slope is nonzero by doing a Bayesian comparison of two models: one model would have an arbitrarily diffuse prior
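The HDI-and-ROPE decision rule can be illustrated with a short sketch. This is a generic implementation under assumptions, not code from the book: the HDI is estimated as the narrowest interval containing 95% of the MCMC samples (a common approach), the posterior samples are faked with a random-number generator as a stand-in for a real chain, and the ROPE limits are invented for the example.

```python
import numpy as np

def hdi(samples, cred_mass=0.95):
    """Narrowest interval containing `cred_mass` of the samples --
    a common way to estimate the HDI from an MCMC chain."""
    s = np.sort(np.asarray(samples))
    n_in = int(np.ceil(cred_mass * len(s)))   # samples inside the interval
    # Width of every candidate interval holding n_in samples.
    widths = s[n_in - 1:] - s[: len(s) - n_in + 1]
    lo = int(np.argmin(widths))               # left edge of narrowest one
    return s[lo], s[lo + n_in - 1]

# Stand-in for posterior slope samples (e.g., pounds per inch).
rng = np.random.default_rng(1)
slope_samples = rng.normal(loc=5.5, scale=1.0, size=50_000)

lo, hi = hdi(slope_samples)
rope = (-0.5, 0.5)                            # illustrative ROPE around zero
excludes_rope = lo > rope[1] or hi < rope[0]  # entire HDI outside the ROPE?
```

With a posterior centered near 5.5 and a ROPE of (−0.5, 0.5), the entire 95% HDI falls above the ROPE, so this decision rule would declare the slope credibly nonzero.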