Ch. 9 - Correlation and Regression
Larson - Elementary Statistics: Picturing the World 8th Edition
ISBN: 9780137493470
Chapter 9, Problem 9.2.6

6. Why is it not appropriate to use a regression line to predict y-values for x-values that are not in (or close to) the range of x-values found in the data?

Verified step by step guidance
1. Understand the concept of extrapolation: Using a regression line to predict y-values for x-values outside the observed data range is called extrapolation.
2. Recognize that the regression line is based on the relationship observed within the range of the data, so predictions outside this range assume the same pattern continues, which may not be true.
3. Consider that the behavior of the variables outside the observed range can be different due to factors not captured in the data, leading to unreliable or misleading predictions.
4. Note that the uncertainty of predictions increases as you move further away from the range of observed x-values, making the regression line less trustworthy for those points.
5. Therefore, it is generally inappropriate to use the regression line for x-values far from the data range because the model's assumptions and accuracy are not guaranteed beyond the observed data.
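
The reasoning above can be illustrated with a small sketch. The data here are hypothetical and not from the textbook: the true relationship is assumed to be curved (y = x^2), but x is observed only from 1 to 5. A least-squares line fits reasonably inside that range and fails badly far outside it.

```python
def fit_line(xs, ys):
    """Return the least-squares slope m and intercept b for y = m*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = y_bar - m * x_bar
    return m, b


# Hypothetical data (an assumption for illustration): y = x^2, observed for x = 1..5.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

m, b = fit_line(xs, ys)


def predict(x):
    """Predict y from the fitted regression line."""
    return m * x + b


# Compare the line's prediction with the true value inside and outside the data range.
inside_error = abs(predict(3) - 3 ** 2)     # x = 3 lies within [1, 5]
outside_error = abs(predict(20) - 20 ** 2)  # x = 20 lies far outside [1, 5]

print(f"line: y = {m}x + {b}")                  # line: y = 6.0x + -7.0
print(f"error at x = 3:  {inside_error}")       # error at x = 3:  2.0
print(f"error at x = 20: {outside_error}")      # error at x = 20: 287.0
```

Nothing in the five observed points signals how large the error becomes at x = 20; that information lies outside the data the line was fitted to.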


Key Concepts

Here are the essential concepts you must grasp in order to answer the question correctly.

Extrapolation

Extrapolation involves using a regression model to predict values outside the range of observed data. It is risky because the relationship established within the data range may not hold beyond it, leading to unreliable or misleading predictions.
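
As a rough sketch of how this definition might be applied before making a prediction, the helper below labels a new x-value as interpolation or extrapolation. The function name, the sample x-values, and the `margin` parameter are all assumptions made for illustration; the question says only "in (or close to)" the range and does not quantify "close".

```python
def classify_prediction(x_new, observed_xs, margin=0.0):
    """Label x_new as 'interpolation' or 'extrapolation'.

    margin is a hypothetical tolerance, expressed as a fraction of the
    observed span, for treating values just outside the range as "close to" it.
    """
    low, high = min(observed_xs), max(observed_xs)
    pad = margin * (high - low)
    if low - pad <= x_new <= high + pad:
        return "interpolation"
    return "extrapolation"


# Hypothetical observed x-values.
observed_xs = [1, 2, 3, 4, 5]

print(classify_prediction(3, observed_xs))         # interpolation
print(classify_prediction(20, observed_xs))        # extrapolation
print(classify_prediction(5.3, observed_xs, 0.1))  # interpolation
```

How large a margin is acceptable is a judgment call that depends on the data and the subject matter, not something the check itself can decide.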

Range of Data (Domain of Predictor Variable)

The range of data refers to the span of x-values used to fit the regression line. Predictions are most reliable within this range since the model is based on observed patterns; values far outside this range lack supporting data and increase uncertainty.
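
One way to see the growing uncertainty is the standard margin of error for a prediction interval at a new value x_0. The symbols are the usual ones and are assumptions here, since this page does not define them: n is the sample size, s_e the standard error of estimate, t_c the critical t-value, and x-bar the mean of the observed x-values.

```latex
E = t_c \, s_e \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}
```

The term (x_0 - x-bar)^2 grows as x_0 moves away from the center of the data, so the interval widens. This formula still assumes the linear model holds at x_0, so it likely understates the risk when the relationship itself changes outside the observed range.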

Assumptions of Linear Regression

Linear regression assumes that the relationship between the variables is linear. The data can support that assumption only within the observed range; outside it, the relationship could change form, making the model's predictions inaccurate or invalid.