Estimating Population Proportions and Means: Confidence Intervals and Sample Size Determination
Study Guide - Smart Notes
Estimating Population Proportions
Point Estimates and Proportions
When estimating a population proportion, we use the sample proportion as our best point estimate. However, this estimate is not guaranteed to be exactly equal to the true population proportion.
Population Proportion (p): The true proportion of individuals in the population with a certain characteristic.
Sample Proportion (\( \hat{p} \)): The proportion observed in the sample, calculated as \( \hat{p} = \frac{x}{n} \), where x is the number of successes and n is the sample size.
Point Estimate: A single value used to approximate a population parameter. For proportions, \( \hat{p} \) is the best point estimate of p.
Example: In a survey of 1007 adults, 85% knew what Twitter is. Here, \( \hat{p} = 0.85 \).
Confidence Intervals for Population Proportions
A confidence interval provides a range of values within which the population proportion is likely to lie, with a specified level of confidence.
Confidence Interval: An interval estimate for a population parameter, expressed as \( \hat{p} \pm E \), where E is the margin of error.
Margin of Error (E): The maximum likely difference between the point estimate and the true population parameter.
Critical Value (\( z_{\alpha/2} \)): The z-score that separates an area of \( \alpha/2 \) in the right tail of the standard normal distribution.
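Confidence Interval Formula: \( \hat{p} - E < p < \hat{p} + E \)
Where the margin of error is: \( E = z_{\alpha/2} \sqrt{ \frac{\hat{p}(1-\hat{p})}{n} } \)
Alternatively, if the confidence interval limits are known:
Point Estimate: \( \hat{p} = \frac{\text{upper limit} + \text{lower limit}}{2} \)
Margin of Error: \( E = \frac{\text{upper limit} - \text{lower limit}}{2} \)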
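Recovering the point estimate and margin of error from published limits is simple arithmetic: the midpoint of the interval and half its width. A minimal Python sketch follows; the function name `from_limits` and the limits 0.828 and 0.872 are illustrative assumptions, not values taken from a specific report.

```python
def from_limits(lower, upper):
    """Return (point_estimate, margin_of_error) for a symmetric interval."""
    point_estimate = (upper + lower) / 2   # midpoint of the interval
    margin_of_error = (upper - lower) / 2  # half the interval width
    return point_estimate, margin_of_error


# Illustrative limits (assumed for this sketch).
p_hat, E = from_limits(0.828, 0.872)
print(round(p_hat, 3), round(E, 3))
```

The same arithmetic applies to any symmetric interval of the form estimate ± E, including intervals for means.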
Requirements for Constructing a Confidence Interval for Proportion
The sample is a simple random sample.
The binomial distribution conditions are satisfied: fixed number of trials, independent trials, two possible outcomes, constant probability.
At least 5 successes and 5 failures in the sample.
Finding Critical Values
To find the critical value \( z_{\alpha/2} \) for a given confidence level, use the standard normal (z) table.
For a 95% confidence level, \( z_{\alpha/2} = 1.96 \)
For a 99% confidence level, \( z_{\alpha/2} = 2.575 \)
For other levels, calculate \( \alpha = 1 - \text{confidence level} \), then find the z-score with area \( 1 - \alpha/2 \) to the left.
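As an alternative to a printed z table, the critical value can be computed with Python's standard library. The sketch below assumes Python 3.8 or later (for `statistics.NormalDist`); the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence_level):
    """z-score with area 1 - alpha/2 to its left under the standard normal curve."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

Software returns about 2.576 for the 99% level; the 2.575 quoted above is the usual table-based rounding, and the difference rarely changes a reported interval.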
Example: Pew Research Center Poll
Sample size: n = 1007
Sample proportion: \( \hat{p} = 0.85 \)
Confidence level: 95% (\( z_{\alpha/2} = 1.96 \))
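Margin of error: \( E = 1.96 \sqrt{ \frac{(0.85)(0.15)}{1007} } \approx 0.0221 \)
Confidence interval: \( 0.85 - 0.0221 < p < 0.85 + 0.0221 \), or \( 0.828 < p < 0.872 \)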
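The Pew example can be checked numerically. The following is a minimal sketch that applies the margin-of-error formula \( E = z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n} \) to the values given above; the function name `proportion_ci` is an assumption made for illustration.

```python
from math import sqrt


def proportion_ci(p_hat, n, z):
    """Return (lower, upper, E) for a confidence interval on a proportion."""
    E = z * sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return p_hat - E, p_hat + E, E


# Values from the Pew example: n = 1007, p_hat = 0.85, z = 1.96 (95%).
lower, upper, E = proportion_ci(0.85, 1007, 1.96)
print(f"E = {E:.4f}; {lower:.3f} < p < {upper:.3f}")
```

Under these inputs the interval lies entirely above 0.75, which is the kind of comparison discussed in the interpretation notes.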
Interpreting Confidence Intervals
If the entire confidence interval is above a certain value (e.g., 0.75), we can be confident the population proportion exceeds that value.
When reporting results, include the confidence level, interval, and context.
Sample Size Determination for Proportions
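To determine the required sample size for estimating a population proportion with a specified margin of error: \( n = \frac{ z_{\alpha/2}^2 \hat{p} (1-\hat{p}) }{ E^2 } \). Always round n up to the next whole number.
If no prior estimate for \( \hat{p} \) is available, use 0.5 for maximum variability: \( n = \frac{ z_{\alpha/2}^2 (0.25) }{ E^2 } \)
Example: To be 95% confident (\( z_{\alpha/2} = 1.96 \)) that the sample percentage is within 3 percentage points (E = 0.03) of the true percentage, and \( \hat{p} = 0.66 \): \( n = \frac{ (1.96)^2 (0.66)(0.34) }{ (0.03)^2 } \approx 957.84 \), which rounds up to 958.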
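The sample size calculation can be sketched in Python as below. The function name `sample_size_proportion` is hypothetical, and the default of 0.5 reflects the no-prior-estimate case described above.

```python
from math import ceil


def sample_size_proportion(E, z, p_hat=0.5):
    """Smallest whole n giving margin of error E; p_hat defaults to 0.5."""
    return ceil(z**2 * p_hat * (1 - p_hat) / E**2)


# Example from the notes: 95% confidence, E = 0.03, prior estimate 0.66.
print(sample_size_proportion(0.03, 1.96, p_hat=0.66))
# Same requirements with no prior estimate (p_hat = 0.5).
print(sample_size_proportion(0.03, 1.96))
```

Rounding up with `ceil` rather than rounding to the nearest integer ensures the margin of error is no larger than requested. The no-prior case needs a larger sample because \( \hat{p}(1-\hat{p}) \) is largest at 0.5.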
Considerations When Analyzing Polls
Sample must be random.
Confidence level and sample size should be reported.
Population size is usually not a factor in determining reliability.
Do not assume small sample percentages relative to population size are unreliable.
Estimating Population Means
Point Estimates and Means
The sample mean is the best point estimate of the population mean (\( \mu \)).
Population Mean (\( \mu \)): The average value in the population.
Sample Mean (\( \bar{x} \)): The average value in the sample.
Confidence Intervals for Population Means
Confidence intervals for means depend on whether the population standard deviation (\( \sigma \)) is known.
Case 1: \( \sigma \) Not Known (Use t-distribution)
Use the t-distribution with degrees of freedom \( df = n - 1 \).
Critical value: \( t_{\alpha/2} \)
Margin of error: \( E = t_{\alpha/2} \frac{s}{\sqrt{n}} \)
Confidence interval: \( \bar{x} - E < \mu < \bar{x} + E \)
Requirements: Population is normally distributed or sample size > 30.
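A minimal sketch of the t-based interval, using the cholesterol example from these notes (n = 49, \( \bar{x} = 0.4 \), s = 21.0). Python's standard library has no t-distribution, so the critical value is passed in; the value 2.011 for df = 48 at 95% confidence is an assumption read from a t table, and the function name `t_interval` is hypothetical.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_crit):
    """Return (lower, upper, E) for a mean when sigma is not known."""
    E = t_crit * s / sqrt(n)  # margin of error with sample standard deviation s
    return x_bar - E, x_bar + E, E


# t_crit = 2.011 is the table value assumed for df = n - 1 = 48, 95% confidence.
lower, upper, E = t_interval(0.4, 21.0, 49, 2.011)
print(f"E = {E:.2f}; {lower:.1f} < mu < {upper:.1f}")
```

Because this interval contains 0, the data in that example are consistent with no mean change.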
Case 2: \( \sigma \) Known (Use z-distribution)
Use the standard normal (z) distribution.
Critical value: \( z_{\alpha/2} \)
Margin of error: \( E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \)
Confidence interval: \( \bar{x} - E < \mu < \bar{x} + E \)
Requirements: Sample is random, \( \sigma \) is known, and population is normal or n > 30.
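A comparable sketch for the case where \( \sigma \) is known, using the mean weight example from these notes (n = 40, \( \bar{x} = 172.55 \), \( \sigma = 26 \)). It assumes Python 3.8 or later for `statistics.NormalDist`; the function name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence_level=0.95):
    """Return (lower, upper, E) for a mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)  # critical value
    E = z * sigma / sqrt(n)                                   # margin of error
    return x_bar - E, x_bar + E, E


lower, upper, E = z_interval(172.55, 26, 40)
print(f"E = {E:.2f}; {lower:.2f} < mu < {upper:.2f}")
```

The computed critical value (about 1.95996) differs slightly from the table value 1.96, but the limits agree to two decimal places here.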
Finding Point Estimate and Margin of Error from Confidence Interval
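Point Estimate: \( \bar{x} = \frac{\text{upper limit} + \text{lower limit}}{2} \)
Margin of Error: \( E = \frac{\text{upper limit} - \text{lower limit}}{2} \)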
Choosing the Appropriate Distribution
If \( \sigma \) is not known, use the t-distribution (TInterval).
If \( \sigma \) is known, use the z-distribution (ZInterval).
Tip: "T" in "NOT" reminds you to use TInterval when \( \sigma \) is not known.
Sample Size Determination for Means (\( \sigma \) Known)
To estimate the required sample size for a desired margin of error: \( n = \left( \frac{ z_{\alpha/2} \sigma }{ E } \right)^2 \), rounded up to the next whole number.
If \( \sigma \) is unknown, estimate it using:
Range Rule of Thumb: \( \sigma \approx \frac{\text{range}}{4} \)
Sample standard deviation from preliminary data
Results from previous studies
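The sample size formula for means, \( n = (z_{\alpha/2} \sigma / E)^2 \), and the range rule of thumb can be sketched as follows. The function names are hypothetical; the inputs are the IQ values (\( \sigma = 15 \), E = 3) and the pulse rate range (43 to 123) that appear in the examples in these notes.

```python
from math import ceil


def sample_size_mean(sigma, E, z=1.96):
    """Smallest whole n so the margin of error for the mean is at most E."""
    return ceil((z * sigma / E) ** 2)


def sigma_from_range(low, high):
    """Range rule of thumb: sigma is roughly range / 4."""
    return (high - low) / 4


print(sample_size_mean(15, 3))    # IQ example, 95% confidence
print(sigma_from_range(43, 123))  # rough sigma from the pulse rate range
```

The range rule gives only a rough estimate of \( \sigma \), so a sample size based on it is best treated as approximate.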
Examples
Estimating Mean Change in Cholesterol: 49 subjects, mean change = 0.4, s = 21.0, 95% confidence interval using t-distribution.
Estimating Mean Weight: n = 40, \( \bar{x} = 172.55 \), \( \sigma = 26 \), 95% confidence interval using z-distribution.
Estimating Sample Size for IQ: \( \sigma = 15 \), E = 3, 95% confidence interval.
Estimating Sample Size for Pulse Rate: Range = 123 - 43 = 80, \( \sigma \approx 20 \), E = 22, 95% confidence interval.
Practice Problems
Construct a 99% confidence interval for the mean calories in 1-oz cookies (n = 15, data provided). Use t-distribution.
Estimate the mean time for statistics students to perceive 1 minute (n = 40, \( \bar{x} = 58.3 \), \( \sigma = 9.5 \)), 95% confidence interval using z-distribution.
Estimate the mean body temperature (n = 106, \( \bar{x} = 98.2 \), s = 0.62), 99% confidence interval using t-distribution.
Estimate the mean mercury in tuna sushi (data provided), 98% confidence interval using t-distribution. Assess if the mean exceeds FDA guideline of 1 ppm.
Summary Table: Confidence Intervals for Proportions and Means
| Parameter | Point Estimate | Distribution | Confidence Interval Formula | Sample Size Formula |
|---|---|---|---|---|
| Proportion (p) | \( \hat{p} = \frac{x}{n} \) | z (normal) | \( \hat{p} \pm z_{\alpha/2} \sqrt{ \frac{\hat{p}(1-\hat{p})}{n} } \) | \( n = \frac{ z_{\alpha/2}^2 \hat{p} (1-\hat{p}) }{ E^2 } \) |
| Mean (\( \mu \)), \( \sigma \) known | \( \bar{x} \) | z (normal) | \( \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \) | \( n = \left( \frac{ z_{\alpha/2} \sigma }{ E } \right)^2 \) |
| Mean (\( \mu \)), \( \sigma \) unknown | \( \bar{x} \) | t (Student's t) | \( \bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} \) | Estimate \( \sigma \) as needed |