Hypothesis Testing for Two Population Parameters: Means and Proportions
Study Guide - Smart Notes
Hypothesis Testing for Two Population Parameters
Introduction
In business statistics, comparing two populations is a common scenario. For example, a company may want to know if customers from two regions purchase at different rates, or if a new process is more efficient than an old one. Hypothesis testing for two population parameters allows us to answer such questions using statistical methods.
Tests for Two Population Means Using Independent Samples
Understanding Independent Samples
Independent samples are samples selected from two populations such that the values observed in one sample have no influence on the values observed in the other.
Example: Comparing math test scores from two separate classes.
Non-example: Comparing scores from the same students before and after tutoring (these are paired samples).
When to Use the t-Test for Two Means
When the population standard deviations are unknown (the most common real-world scenario).
We use the t-distribution instead of the normal distribution.
Degrees of freedom must be calculated to find the appropriate critical t-value.
Assumptions About Population Variances
We may assume variances are not equal (Welch's t-test, more versatile and conservative).
We may assume variances are equal (Pooled t-test, less conservative, but less often justified in practice).
Note: In business applications, we often do not know if variances are equal, so the unequal variance test is preferred.
Test Statistic for Two Means (Unequal Variances)
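The test statistic (for a null hypothesis of zero difference) is calculated as:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Where $\bar{x}_1, \bar{x}_2$ are sample means, $s_1^2, s_2^2$ are sample variances, and $n_1, n_2$ are sample sizes.
Degrees of freedom (df) are approximated by:
$$df \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$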
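The unequal-variance calculation can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `welch_t` and the two small data sets are hypothetical, and the null hypothesis is assumed to be a zero difference in means.

```python
import math
from statistics import mean, variance

def welch_t(sample1, sample2):
    """Return (t, df) for the unequal-variance t-test with H0: mu1 - mu2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    # statistics.variance uses the n - 1 denominator (sample variance)
    v1 = variance(sample1) / n1
    v2 = variance(sample2) / n2
    t = (mean(sample1) - mean(sample2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical data, for illustration only
group_a = [12.0, 15.0, 11.0, 14.0, 13.0]
group_b = [10.0, 9.0, 12.0, 8.0, 11.0]

t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df:.2f}")  # t = 3.000, df = 8.00
```

Because the two hypothetical samples have the same size and the same sample variance, the approximate df equals $n_1 + n_2 - 2 = 8$; with unequal variances it would be smaller.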
Example: Comparing Checkout Page Designs
A marketing team tests if a new shopping cart checkout page reduces time spent compared to the old version.
15 users are randomly assigned to each version, and time spent is measured.
Version | Mean Time (s) |
|---|---|
New | 45 |
Old | 48 |
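Null hypothesis: No difference in mean times ($H_0: \mu_{\text{new}} = \mu_{\text{old}}$).
Alternative hypothesis: The new version takes less time ($H_a: \mu_{\text{new}} < \mu_{\text{old}}$, one-tailed test).
Significance level: $\alpha$ (for example, 0.05).
Calculate the t-statistic and compare it to the critical t-value to make a decision. The sample standard deviations, which the table above does not list, are also needed for this calculation.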
Example: Comparing Sales at Two Locations
Store | Mean Sales ($) | Std. Dev. | n |
|---|---|---|---|
A | 150 | 10 | 15 |
B | 155 | 12 | 15 |
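Null hypothesis: Mean sales are equal ($H_0: \mu_A = \mu_B$).
Significance level: $\alpha$ (for example, 0.05).
Calculate the t-statistic and compare it to the critical t-value for df = 28 at the chosen significance level.
Conclusion: If $|t|$ is less than the critical value, do not reject the null hypothesis.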
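As a worked check of this example, the arithmetic can be sketched as follows. It assumes the unequal-variance statistic with a null of zero difference and, as an assumption the example does not state, a two-tailed test at $\alpha = 0.05$:

$$
t = \frac{150 - 155}{\sqrt{\dfrac{10^2}{15} + \dfrac{12^2}{15}}} = \frac{-5}{\sqrt{16.27}} \approx -1.24
$$

Under those assumptions the tabulated critical value for df = 28 is about 2.048, and $|t| \approx 1.24 < 2.048$, so the null hypothesis of equal mean sales would not be rejected. The unequal-variance approximation gives a df of about 27.1 rather than 28, which leaves the critical value nearly unchanged (about 2.05).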
Confidence Interval for Difference of Means
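The confidence interval for $\mu_1 - \mu_2$ is:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$$
Where $t_{\alpha/2}$ is the critical value from the t-distribution for the chosen confidence level.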
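A short sketch of this interval, applied to the store A and store B figures from the sales example. The function name `diff_means_ci` is hypothetical, the 95% confidence level is an assumption, and the critical value 2.048 is the tabulated two-sided 95% value for df = 28 (it is passed in by hand because the Python standard library has no t-distribution).

```python
import math

def diff_means_ci(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """Confidence interval for mu1 - mu2 using the unpooled standard error."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = mean1 - mean2
    margin = t_crit * se
    return diff - margin, diff + margin

# Store A: mean 150, sd 10, n 15; store B: mean 155, sd 12, n 15
# t_crit = 2.048 assumes a 95% level with df = 28
low, high = diff_means_ci(150, 10, 15, 155, 12, 15, t_crit=2.048)
print(f"95% CI for mu_A - mu_B: ({low:.2f}, {high:.2f})")  # (-13.26, 3.26)
```

The interval contains 0, which agrees with not rejecting the hypothesis of equal mean sales at the same assumed level.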
Tests for Paired Samples (Dependent Samples)
Understanding Paired Samples
Paired samples (dependent samples) are matched or related in some way, such as before-and-after measurements on the same subjects.
Common in pre-post studies, or when controlling for extraneous factors.
Example: Measuring sales for the same customers before and after a marketing campaign.
Hypothesis Testing for Paired Differences
Null hypothesis: The mean paired difference is zero ($H_0: \mu_d = 0$).
Alternative hypothesis: The mean paired difference is not zero, greater than zero, or less than zero (two-tailed or one-tailed tests).
Calculating the Paired t-Test Statistic
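Compute the difference for each pair: $d_i = x_{1i} - x_{2i}$
Calculate the mean and standard deviation of the differences:
$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$$
The test statistic is:
$$t = \frac{\bar{d} - \mu_0}{s_d / \sqrt{n}}$$
Where $\mu_0$ is the hypothesized mean difference (often 0) and $n$ is the number of pairs.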
Example: Comparing Time Spent on Movie Genres
User | Comedy (hrs) | Drama (hrs) | Difference (d) |
|---|---|---|---|
1 | 2 | 1 | 1 |
2 | 4 | 1.5 | 2.5 |
3 | 2 | 4 | -2 |
Null hypothesis: No difference in time spent.
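Alternative hypothesis: Time spent on drama is less than time spent on comedy. With $d = \text{Comedy} - \text{Drama}$, as in the table, this is $H_a: \mu_d > 0$ (a right-tailed test); it is left-tailed only if the differences are defined as Drama minus Comedy.
Calculate $\bar{d}$, $s_d$, and the t-statistic; compare to the critical t-value for $n - 1$ degrees of freedom.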
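The calculation for the three users in the table can be sketched as below. The variable names are illustrative, and the hypothesized mean difference is assumed to be 0.

```python
import math
from statistics import mean, stdev

# Hours from the table above
comedy = [2.0, 4.0, 2.0]
drama = [1.0, 1.5, 4.0]

# Differences follow the table: d = Comedy - Drama
d = [c - r for c, r in zip(comedy, drama)]
n = len(d)
d_bar = mean(d)
s_d = stdev(d)  # sample standard deviation (n - 1 denominator)
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))  # hypothesized difference of 0
df = n - 1

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t_stat:.3f}, df = {df}")
# d_bar = 0.50, s_d = 2.291, t = 0.378, df = 2
```

With d defined as Comedy minus Drama, a positive mean difference points toward less time spent on drama, but a t-statistic this small on only three pairs is weak evidence in either direction.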
Confidence Interval for Mean Paired Difference
The confidence interval is:
$$\bar{d} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}}$$
Where $t_{\alpha/2}$ is the critical value for $n - 1$ degrees of freedom.
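A worked sketch using the three differences from the movie-genre example ($\bar{d} = 0.5$, $s_d \approx 2.291$, $n = 3$). The 95% confidence level is an assumption, since the example does not specify one; the tabulated critical value for 2 degrees of freedom is then $t_{0.025} \approx 4.303$:

$$
0.5 \pm 4.303 \times \frac{2.291}{\sqrt{3}} \approx 0.5 \pm 5.69
$$

This gives an interval of roughly $(-5.19,\ 6.19)$. It contains 0 and is very wide, which is expected with only three pairs.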
Tests for Two Population Proportions
Introduction to Proportion Tests
Used to compare the proportion of successes in two populations.
Example: Comparing purchase completion rates between two customer groups.
Test Statistic for Difference in Proportions
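Let $\hat{p}_1$ and $\hat{p}_2$ be the sample proportions from populations 1 and 2, with sample sizes $n_1$ and $n_2$.
The pooled proportion is:
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$$
where $x_1$ and $x_2$ are the numbers of successes in each sample.
The test statistic is:
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
Assumptions: Each group should have at least 5 expected successes and failures ($n_1\hat{p} \ge 5$, $n_1(1 - \hat{p}) \ge 5$, etc.).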
Example: Comparing Product Failure Rates
Design | Failures | Sample Size | Proportion |
|---|---|---|---|
Old | 60 | 250 | 0.24 |
New | 50 | 250 | 0.20 |
Null hypothesis: No difference in failure rates.
Alternative hypothesis: New design has a lower failure rate (one-tailed test).
Calculate the pooled proportion and the z-statistic, and compare to the critical z-value (e.g., $z_{0.05} \approx 1.645$ for a one-tailed test at $\alpha = 0.05$).
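The steps for this example can be sketched in Python using only the standard library. The significance level of 0.05 is an assumption, and the difference is taken as old minus new so that a lower failure rate for the new design corresponds to the upper tail.

```python
import math
from statistics import NormalDist

# Counts from the table above
x_old, n_old = 60, 250
x_new, n_new = 50, 250

p_old, p_new = x_old / n_old, x_new / n_new
p_pool = (x_old + x_new) / (n_old + n_new)

# Rule-of-thumb check: at least 5 expected successes and failures per group
assert min(n_old * p_pool, n_old * (1 - p_pool),
           n_new * p_pool, n_new * (1 - p_pool)) >= 5

se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_old + 1 / n_new))
z = (p_old - p_new) / se

# H_a: p_new < p_old, i.e. p_old - p_new > 0, so use the upper tail
p_value = 1 - NormalDist().cdf(z)
alpha = 0.05  # assumed significance level
z_crit = NormalDist().inv_cdf(1 - alpha)
reject = z > z_crit

print(f"pooled p = {p_pool:.2f}, z = {z:.2f}, "
      f"critical z = {z_crit:.3f}, reject H0: {reject}")
```

Here z is about 1.08, below the critical value of about 1.645, so at the assumed level the data would not support the claim that the new design fails less often.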
Confidence Interval for Difference in Proportions
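The confidence interval for $p_1 - p_2$ is:
$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$
Where $z_{\alpha/2}$ is the critical value from the standard normal distribution for the chosen confidence level.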
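A minimal sketch of this interval for the failure-rate example, with the old design as group 1 and the new design as group 2. The function name `diff_props_ci` and the default 95% level are assumptions; the standard error here uses each sample's own proportion rather than the pooled one.

```python
import math
from statistics import NormalDist

def diff_props_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value from the standard normal distribution
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * se
    return (p1 - p2) - margin, (p1 - p2) + margin

# Old design: 60 failures of 250; new design: 50 failures of 250
low, high = diff_props_ci(60, 250, 50, 250)
print(f"95% CI for p_old - p_new: ({low:.4f}, {high:.4f})")
```

The interval runs from roughly -0.03 to 0.11, so it includes 0 and the observed 4-point difference is within sampling noise at this level.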
Summary Table: Choosing the Right Test
Scenario | Test | Key Assumptions |
|---|---|---|
Two independent means, unknown variances | Two-sample t-test (Welch's) | Samples independent, normality, variances not assumed equal |
Paired/dependent samples | Paired t-test | Differences are normally distributed |
Two proportions | z-test for proportions | Large samples, at least 5 expected successes/failures per group |
Additional info: In practice, always check assumptions (normality, independence, equal variances if required) before applying these tests. Use statistical software for calculations when sample sizes are large or formulas are complex.