7.1 Estimating Population Proportions and Determining Sample Sizes
Chapter 7: Estimating Parameters and Determining Sample Sizes
Introduction to Inferential Statistics
Inferential statistics involves using sample data to make inferences or draw conclusions about population parameters. This chapter focuses on estimating population proportions and means, and determining the appropriate sample size for such estimations.
Estimation: Using sample data to estimate population parameters (such as proportions and means).
Hypothesis Testing: Using sample data to test claims about population parameters (covered in later chapters).
Estimating a Population Proportion
Point Estimate
A point estimate is a single value used to approximate a population parameter. For population proportions, the sample proportion \( \hat{p} \) is the best point estimate of the population proportion \( p \).
Unbiased Estimator: A statistic whose sampling distribution has a mean equal to the population parameter it estimates. \( \hat{p} \) is unbiased for \( p \).
Example: In a survey of 950 students, 53% take online courses. The best point estimate of the proportion of all students who take online courses is 0.53.
Confidence Interval (CI)
A confidence interval is a range of values used to estimate the true value of a population parameter. It consists of two main elements:
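Confidence Level: The long-run proportion (e.g., 95%) of intervals produced by the method that contain the population parameter, assuming the sampling process is repeated many times. It equals \( 1 - \alpha \).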
Margin of Error (E): The maximum likely difference between the sample statistic and the population parameter.
Relationship Between Confidence Level and \( \alpha \)
| Confidence Level | \( \alpha \) |
|---|---|
| 90% | 0.10 |
| 95% | 0.05 |
| 99% | 0.01 |
Critical Values
The critical value is the z-score that separates the central area from the tails in the standard normal distribution, corresponding to the desired confidence level.


| Confidence Level | \( \alpha \) | Critical Value \( z_{\alpha/2} \) |
|---|---|---|
| 90% | 0.10 | 1.645 |
| 95% | 0.05 | 1.96 |
| 99% | 0.01 | 2.575 |
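The table values can be reproduced numerically. The following is a minimal sketch using Python's standard library; the function name `critical_value` is chosen here for illustration and is not part of the original notes.

```python
from statistics import NormalDist


def critical_value(confidence_level: float) -> float:
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence_level
    # z_{alpha/2} leaves an area of alpha/2 in the right tail, so it is the
    # (1 - alpha/2) quantile of the standard normal distribution.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {critical_value(level):.3f}")
```

For the 99% level the computed value rounds to 2.576; the 2.575 in the table is a common textbook rounding, and the difference rarely matters in practice.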

Margin of Error for Proportions
The margin of error (E) for estimating a population proportion is calculated as:
\( E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} \)
where \( \hat{q} = 1 - \hat{p} \), \( n \) is the sample size, and \( z_{\alpha/2} \) is the critical value.
The interval \( \hat{p} \pm E \) built from this margin of error is known as the Wald confidence interval.
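As a quick check of the arithmetic, the margin of error \( E = z_{\alpha/2} \sqrt{\hat{p}\hat{q}/n} \) can be sketched in a few lines of Python. The function name `margin_of_error` is illustrative, and the inputs are the survey figures from the point estimate example (950 students, 53%).

```python
import math


def margin_of_error(p_hat: float, n: int, z: float) -> float:
    """Margin of error for a proportion: E = z * sqrt(p_hat * q_hat / n)."""
    q_hat = 1.0 - p_hat
    return z * math.sqrt(p_hat * q_hat / n)


# Survey of 950 students with a sample proportion of 0.53, at 95% confidence.
e = margin_of_error(p_hat=0.53, n=950, z=1.96)
print(f"E = {e:.4f}")
```

Because \( n \) sits under the square root, halving the margin of error requires roughly four times the sample size.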
Interpreting Confidence Intervals
Correct interpretation: "We are 95% confident that the interval from 0.405 to 0.455 contains the true population proportion \( p \)." This means that if we repeated the sampling process many times, 95% of the constructed intervals would contain \( p \).
Incorrect: "There is a 95% chance that \( p \) is between 0.405 and 0.455." (The parameter is fixed; the interval is random.)
Incorrect: "95% of sample proportions will fall between 0.405 and 0.455."
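The "repeated sampling" reading of the confidence level can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true proportion (0.43), the sample size (500), the number of trials, and the seed are all hypothetical values picked for the demonstration, not figures from the notes.

```python
import math
import random


def wald_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wald interval p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    e = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - e, p_hat + e


def simulated_coverage(true_p: float, n: int, trials: int, seed: int = 0) -> float:
    """Fraction of simulated samples whose interval contains the fixed true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n: each individual is a "success" with probability true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(successes, n)
        if low <= true_p <= high:
            hits += 1
    return hits / trials


coverage = simulated_coverage(true_p=0.43, n=500, trials=2000)
print(f"Fraction of intervals containing p: {coverage:.3f}")
```

The parameter never changes between trials; only the interval does. That is why the 95% describes the method rather than any single computed interval.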

Constructing a Confidence Interval for \( p \)
To construct a confidence interval for a population proportion:
Verify requirements: simple random sample, binomial conditions, at least 5 successes and 5 failures.
Find the critical value \( z_{\alpha/2} \).
Calculate the margin of error \( E \).
Compute the interval: \( \hat{p} - E < p < \hat{p} + E \).
Round limits to three significant digits.
Example: Online Courses
Sample size: \( n = 950 \)
Sample proportion: \( \hat{p} = 0.53 \)
Critical value for 95% CI: \( z_{\alpha/2} = 1.96 \)
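Margin of error: \( E = 1.96 \sqrt{\frac{(0.53)(0.47)}{950}} \approx 0.0317 \)
Confidence interval: \( 0.498 < p < 0.562 \) or \( (0.498, 0.562) \)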
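The construction steps can be sketched as one function. This is a minimal illustration, not a definitive implementation: the name `proportion_confidence_interval` is made up here, the check covers only the "at least 5 successes and 5 failures" requirement, and whether the data come from a simple random sample cannot be verified in code.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(
    p_hat: float, n: int, confidence_level: float = 0.95
) -> tuple[float, float]:
    """Wald confidence interval for a population proportion."""
    # Step 1: check the count requirement (random sampling must be checked separately).
    if p_hat * n < 5 or (1.0 - p_hat) * n < 5:
        raise ValueError("need at least 5 successes and 5 failures")
    # Step 2: critical value z_{alpha/2}.
    alpha = 1.0 - confidence_level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Step 3: margin of error.
    e = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    # Step 4: interval limits.
    return p_hat - e, p_hat + e


low, high = proportion_confidence_interval(p_hat=0.53, n=950)
print(f"{low:.3f} < p < {high:.3f}")
```

With the online courses data this prints limits of 0.498 and 0.562 after rounding to three significant digits.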

Analyzing Polls
Sample should be a simple random sample.
Confidence level and sample size should be reported.
Population size is usually not a factor in reliability; sample size and method are more important.
Finding Point Estimate and Margin of Error from a Confidence Interval
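Point estimate: \( \hat{p} = \frac{\text{upper limit} + \text{lower limit}}{2} \)
Margin of error: \( E = \frac{\text{upper limit} - \text{lower limit}}{2} \)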
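Since the interval is symmetric about the sample proportion, the point estimate is its midpoint and the margin of error is half its width. A short sketch, using the interval 0.405 to 0.455 quoted earlier in these notes (the function name is illustrative):

```python
def point_estimate_and_margin(lower: float, upper: float) -> tuple[float, float]:
    """Recover p_hat (midpoint) and E (half-width) from confidence interval limits."""
    p_hat = (upper + lower) / 2.0
    e = (upper - lower) / 2.0
    return p_hat, e


p_hat, e = point_estimate_and_margin(0.405, 0.455)
print(f"p_hat = {p_hat:.3f}, E = {e:.3f}")
```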
Determining Sample Size for Estimating a Proportion
To estimate a population proportion with a specified margin of error and confidence level, the required sample size \( n \) is:
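If an estimate of \( \hat{p} \) is known: \( n = \frac{(z_{\alpha/2})^2 \hat{p}\hat{q}}{E^2} \)
If no estimate is known (use maximum variability, \( \hat{p} = 0.5 \), so \( \hat{p}\hat{q} = 0.25 \)): \( n = \frac{(z_{\alpha/2})^2 \cdot 0.25}{E^2} \)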
Always round up to the next whole number.
Example: Online Purchases
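Prior estimate: \( \hat{p} = 0.79 \), \( \hat{q} = 0.21 \), \( E = 0.03 \), \( z_{\alpha/2} = 1.96 \), so \( n = \frac{(1.96)^2 (0.79)(0.21)}{(0.03)^2} \approx 708.13 \), which rounds up to 709.
No prior estimate: \( n = \frac{(1.96)^2 (0.25)}{(0.03)^2} \approx 1067.11 \), which rounds up to 1068.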
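Both cases of the sample size calculation, \( n = z_{\alpha/2}^2 \hat{p}\hat{q} / E^2 \) with \( \hat{p}\hat{q} = 0.25 \) when no estimate is available, can be sketched in one small function. The name `sample_size_for_proportion` and the optional `p_hat` argument are illustrative choices, not something the notes prescribe.

```python
import math
from typing import Optional


def sample_size_for_proportion(
    margin: float, z: float, p_hat: Optional[float] = None
) -> int:
    """Smallest n that achieves the given margin of error for a proportion."""
    # With no prior estimate, use p_hat = 0.5, which maximizes p_hat * q_hat at 0.25.
    pq = 0.25 if p_hat is None else p_hat * (1.0 - p_hat)
    # Always round up: rounding down would give a margin slightly larger than requested.
    return math.ceil(z ** 2 * pq / margin ** 2)


print(sample_size_for_proportion(margin=0.03, z=1.96, p_hat=0.79))  # prior estimate
print(sample_size_for_proportion(margin=0.03, z=1.96))              # no prior estimate
```

For the online purchases example this gives 709 with the prior estimate and 1068 without it, which shows how a prior estimate far from 0.5 reduces the required sample.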
Alternative Confidence Interval Methods
Coverage Probability
The coverage probability of a confidence interval is the actual proportion of intervals that contain the true parameter. The Wald CI often has lower coverage probability than the nominal confidence level, especially for small samples or proportions near 0 or 1.
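For a given \( n \) and true \( p \), the coverage probability can be computed exactly by summing binomial probabilities over every outcome whose interval contains \( p \). The sketch below does this for the Wald interval; the values \( n = 20 \) and \( p = 0.10 \) are hypothetical, chosen only to show a small-sample case with a proportion near 0.

```python
import math


def wald_interval(x: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wald interval for x successes in n trials."""
    p_hat = x / n
    e = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - e, p_hat + e


def exact_coverage(true_p: float, n: int, interval=wald_interval) -> float:
    """Probability that the interval contains true_p when X ~ Binomial(n, true_p)."""
    total = 0.0
    for x in range(n + 1):
        low, high = interval(x, n)
        if low <= true_p <= high:
            # Add the binomial probability of observing exactly x successes.
            total += math.comb(n, x) * true_p ** x * (1.0 - true_p) ** (n - x)
    return total


print(f"Actual coverage of the nominal 95% Wald interval: {exact_coverage(0.10, 20):.3f}")
```

In this setting the actual coverage comes out noticeably below 0.95. One reason is that a sample with zero successes produces the degenerate interval from 0 to 0, which cannot contain \( p \).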
Better Performing Confidence Intervals
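Plus Four Method: Add 2 to the number of successes and 2 to the number of failures (so the sample size increases by 4), then apply the Wald formula to the adjusted values, \( \tilde{p} = \frac{x + 2}{n + 4} \) with sample size \( n + 4 \), where \( x \) is the number of successes. This improves coverage probability.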
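Wilson Score Interval: More complex, but coverage probability is closer to the nominal level. Formula: \( \frac{\hat{p} + \frac{z^2}{2n} \pm z \sqrt{\frac{\hat{p}\hat{q}}{n} + \frac{z^2}{4n^2}}}{1 + \frac{z^2}{n}} \), where \( z = z_{\alpha/2} \).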
Clopper-Pearson Method: An exact method based on the binomial distribution; tends to be conservative (actual coverage probability is at least the nominal level).
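The plus four and Wilson score intervals are short enough to sketch directly. The code below is an illustration under assumptions: the function names are made up, the sample (2 successes in 20 trials) is hypothetical, and the Wilson formula used is the standard one, with center \( (\hat{p} + z^2/2n)/(1 + z^2/n) \). The Clopper-Pearson method is omitted because it needs binomial or beta quantiles.

```python
import math


def plus_four_interval(x: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wald formula applied after adding 2 successes and 2 failures."""
    n_adj = n + 4
    p_adj = (x + 2) / n_adj
    e = z * math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    return p_adj - e, p_adj + e


def wilson_interval(x: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for x successes in n trials."""
    p_hat = x / n
    denom = 1.0 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


x, n = 2, 20  # hypothetical small sample
for name, interval in (("Plus four", plus_four_interval), ("Wilson", wilson_interval)):
    low, high = interval(x, n)
    print(f"{name}: {low:.3f} to {high:.3f}")
```

The plus four adjustment is usually described for 95% intervals, where \( z_{\alpha/2} \approx 2 \). Unlike the Wald interval, the Wilson interval stays within 0 and 1 and does not collapse to a single point when there are no successes.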
Summary Table: Confidence Interval Methods
| Method | Coverage Probability | Complexity |
|---|---|---|
| Wald | Often too low | Simple |
| Plus Four | Closer to nominal | Simple |
| Wilson Score | Very close to nominal | Moderate |
| Clopper-Pearson | At least nominal | Complex |
Note: The Wald interval is useful for teaching, but the plus four and Wilson score intervals are preferred for practical applications due to better coverage properties.