Ch. 10 - Correlation and Regression
Triola, Elementary Statistics, 14th Edition (ISBN: 9780137366446)
Chapter 10, Problem 10.3.20

Variation and Prediction Intervals
In Exercises 17–20, find the (a) explained variation, (b) unexplained variation, and (c) indicated prediction interval. In each case, there is sufficient evidence to support a claim of a linear correlation, so it is reasonable to use the regression equation when making predictions.
Weighing Seals with a Camera The table below lists overhead widths (cm) of seals measured from photographs and the weights (kg) of the seals (based on “Mass Estimation of Weddell Seals Using Techniques of Photogrammetry,” by R. Garrott of Montana State University). For the prediction interval, use a 99% confidence level with an overhead width of 9.0 cm.

Verified step by step guidance
Step 1: Begin by calculating the regression equation. The least-squares regression line has the form ŷ = mx + b, where m is the slope and b is the y-intercept. To find m, use the formula m = (Σxy - n·x̄·ȳ) / (Σx² - n·x̄²), where n is the number of data pairs and x̄ and ȳ are the sample means. Then calculate b using b = ȳ - m·x̄.
Step 2: Compute the explained variation. The explained variation is the sum of the squared differences between the predicted values (ŷ) and the mean of the observed values (ȳ). Use the formula Σ(ŷ - ȳ)².
Step 3: Compute the unexplained variation. The unexplained variation is the sum of the squared differences between the observed values (y) and the predicted values (ŷ). Use the formula Σ(y - ŷ)².
Step 4: Calculate the prediction interval for an overhead width of 9.0 cm. First, use the regression equation to find the predicted value ŷ at x₀ = 9.0 cm. Next, compute the standard error of estimate from the unexplained variation: s_e = √(Σ(y - ŷ)² / (n - 2)). Then use the prediction interval formula ŷ ± t · s_e · √(1 + 1/n + (x₀ - x̄)² / Σ(x - x̄)²), where t is the two-tailed critical value from the t-distribution with n - 2 degrees of freedom for a 99% confidence level (α = 0.01).
Step 5: Interpret the prediction interval. The interval provides a range within which the weight of a seal with an overhead width of 9.0 cm is expected to fall, with 99% confidence.
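Steps 1 through 4 can be sketched in Python as follows. This is a minimal illustration, not the textbook's own procedure: the function name and the returned keys are hypothetical, the critical value t_crit is assumed to be looked up by the reader in a t table for n - 2 degrees of freedom, and the sample pairs at the bottom are made-up numbers, not the seal data from the exercise.

```python
import math

def variation_and_prediction_interval(x, y, x0, t_crit):
    """Least-squares fit plus explained/unexplained variation and a
    prediction interval at x0. t_crit is supplied by the caller."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Step 1: slope and intercept (deviation form, equivalent to the
    # computational formula with sums of xy and x squared).
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_bar - m * x_bar
    y_hat = [m * xi + b for xi in x]

    # Step 2: explained variation, sum of (y_hat - y_bar)^2.
    explained = sum((yh - y_bar) ** 2 for yh in y_hat)

    # Step 3: unexplained variation, sum of (y - y_hat)^2.
    unexplained = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

    # Step 4: standard error of estimate and prediction interval.
    s_e = math.sqrt(unexplained / (n - 2))
    y0 = m * x0 + b
    margin = t_crit * s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)

    return {
        "slope": m,
        "intercept": b,
        "explained": explained,
        "unexplained": unexplained,
        "y0": y0,
        "interval": (y0 - margin, y0 + margin),
    }

# Made-up illustration data (NOT the seal measurements).
demo_x = [1, 2, 3, 4, 5]
demo_y = [2, 4, 5, 4, 5]
# 5.841 is the table value for 99% confidence with 3 degrees of freedom.
result = variation_and_prediction_interval(demo_x, demo_y, x0=3, t_crit=5.841)
print(result)
```

For the made-up pairs above, the explained variation comes out at about 3.6 and the unexplained variation at about 2.4. To work the actual exercise, replace the demo lists with the widths and weights from the table and use x0 = 9.0 with the 99% critical value for n - 2 degrees of freedom.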

Key Concepts

Here are the essential concepts you must grasp in order to answer the question correctly.

Explained Variation

Explained variation refers to the portion of the total variation in the dependent variable (in this case, the weight of seals) that can be attributed to the independent variable (overhead width). It is the sum of squares due to regression (SSR), Σ(ŷ - ȳ)², computed from the values predicted by the regression model. The larger the explained variation is relative to the total variation Σ(y - ȳ)², the stronger the linear relationship between the variables; that ratio is the coefficient of determination, r².
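As a compact summary of how the pieces fit together, the standard decomposition for a least-squares line with an intercept can be written as below. The notation (ŷ for predicted values, ȳ for the mean of the observed values) follows the steps above; the identity itself is a general regression fact rather than something stated on this page.

```latex
\underbrace{\sum (y - \bar{y})^2}_{\text{total variation}}
  = \underbrace{\sum (\hat{y} - \bar{y})^2}_{\text{explained variation}}
  + \underbrace{\sum (y - \hat{y})^2}_{\text{unexplained variation}},
\qquad
r^2 = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2}
```

A quick check on hand calculations is that the explained and unexplained variations add up to the total variation.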

Unexplained Variation

Unexplained variation, also known as residual variation, is the part of the total variation in the dependent variable that cannot be accounted for by the independent variable. It is represented by the sum of squares of the residuals (SSE) in a regression analysis. Understanding unexplained variation is crucial for assessing the accuracy of predictions made by the regression model, as it indicates the degree of error in the model's predictions.

Prediction Interval

A prediction interval provides a range of values within which we expect a future observation to fall, given a certain level of confidence (e.g., 99%). It takes into account both the variability of the data and the uncertainty in the regression model. The prediction interval is wider than a confidence interval for the mean response because it includes the additional variability of individual observations, making it essential for making informed predictions.
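The difference in width can be made concrete with a short Python sketch. It assumes the usual simple-regression formulas: the margin for the mean response uses √(1/n + (x₀ - x̄)² / Σ(x - x̄)²), while the prediction margin adds 1 under the square root. The function name and argument names are hypothetical, and the numbers in the example call are arbitrary placeholders rather than values from the seal data.

```python
import math

def interval_margins(n, s_e, x0, x_bar, sxx, t_crit):
    """Return (margin for the mean response, margin for one new observation).

    n      : number of data pairs
    s_e    : standard error of estimate
    x0     : x value at which the interval is wanted
    x_bar  : mean of the x values
    sxx    : sum of squared deviations of x from x_bar
    t_crit : t critical value for n - 2 degrees of freedom (from a table)
    """
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    mean_margin = t_crit * s_e * math.sqrt(leverage)
    # The extra 1 accounts for the scatter of an individual observation.
    pred_margin = t_crit * s_e * math.sqrt(1 + leverage)
    return mean_margin, pred_margin

# Arbitrary placeholder inputs, for illustration only.
mean_margin, pred_margin = interval_margins(
    n=5, s_e=1.0, x0=3, x_bar=3, sxx=10, t_crit=2.0
)
print(mean_margin, pred_margin)
```

Because the prediction margin always has the larger quantity under the square root, it exceeds the mean-response margin for any inputs with s_e > 0.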
Related Practice
Textbook Question

Finding the Best Model

In Exercises 5–16, construct a scatterplot and identify the mathematical model that best fits the given data. Assume that the model is to be used only for the scope of the given data, and consider only linear, quadratic, logarithmic, exponential, and power models.

Dirt Cheap The Cherry Hill Construction company in Branford, CT sells screened topsoil by the “yard,” which is actually a cubic yard. Let the variable x be the length (yd) of each side of a cube of screened topsoil. The table below lists the values of x along with the corresponding cost (dollars).

Textbook Question

Finding the Best Model

In Exercises 5–16, construct a scatterplot and identify the mathematical model that best fits the given data. Assume that the model is to be used only for the scope of the given data, and consider only linear, quadratic, logarithmic, exponential, and power models.

Stock Market Listed below in order by row are the annual high values of the Dow Jones Industrial Average for each year beginning with 2000. Find the best model and then predict the value for the last year listed. Is the predicted value close to the actual value of 26,828.4?

Textbook Question

Testing for a Linear Correlation

In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of α = 0.05. Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.)

Taxis Using the data from Exercise 15, is there sufficient evidence to support the claim that there is a linear correlation between the distance of the ride and the tip amount? Does it appear that riders base their tips on the distance of the ride?

Textbook Question

Interpreting a Computer Display

In Exercises 5–8, we want to consider the correlation between heights of fathers and mothers and the heights of their sons. Refer to the StatCrunch display and answer the given questions or identify the indicated items. The display is based on Data Set 10 “Family Heights” in Appendix B. (The response y variable represents heights of sons.)

[IMAGE]


Height of Son Should the multiple regression equation be used for predicting the height of a son based on the height of his father and mother? Why or why not?

Textbook Question

Notation The author conducted an experiment in which the height of each student was measured in centimeters and those heights were matched with the same students’ scores on the first statistics test. If we find that r = 0, does that indicate that there is no association between those two variables?

Textbook Question

Testing for a Linear Correlation

In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of α = 0.05. Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.)

Powerball Jackpots and Tickets Sold Listed below are the same data from Table 10-1 in the Chapter Problem, but an additional pair of values has been added from actual Powerball results. Is there sufficient evidence to conclude that there is a linear correlation between lottery jackpots and numbers of tickets sold? Comment on the effect of the added pair of values in the last column. Compare the results to those obtained in Example 4.
