Hypothesis Testing for Two Population Parameters: Means and Proportions

Hypothesis Testing for Two Population Parameters

Introduction

In business statistics, comparing two populations is a common scenario. For example, a company may want to know if customers from two regions purchase at different rates, or if a new process is more efficient than an old one. Hypothesis testing for two population parameters allows us to answer such questions using statistical methods.

Tests for Two Population Means Using Independent Samples

Understanding Independent Samples

Independent samples are samples selected from two populations such that the values observed in one sample have no influence on the values observed in the other.
Example: Comparing math test scores from two separate classes.
Non-example: Comparing scores from the same students before and after tutoring (these are paired samples).

When to Use the t-Test for Two Means

When the population standard deviations are unknown (the most common real-world scenario).
We use the t-distribution instead of the normal distribution.
Degrees of freedom must be calculated to find the appropriate critical t-value.

Assumptions About Population Variances

We may assume variances are not equal (Welch's t-test, more versatile and conservative).
We may assume variances are equal (Pooled t-test, less conservative, but less often justified in practice).
Note: In business applications, we often do not know if variances are equal, so the unequal variance test is preferred.

Test Statistic for Two Means (Unequal Variances)

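The test statistic is calculated as:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Where $\bar{x}_1, \bar{x}_2$ are sample means, $s_1^2, s_2^2$ are sample variances, $n_1, n_2$ are sample sizes, and $\Delta_0$ is the hypothesized difference $\mu_1 - \mu_2$ under the null hypothesis (usually 0).
Degrees of freedom (df) are approximated by:

$$df \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$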
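A minimal Python sketch of the unequal-variance (Welch) test statistic and its approximate degrees of freedom, computed from raw samples. The function name `welch_t` and the two data lists are hypothetical and used only for illustration; the null hypothesis is assumed to be a difference of zero.

```python
import math
import statistics


def welch_t(sample1, sample2):
    """Return (t, df) for the unequal-variance two-sample t-test.

    Assumes the hypothesized difference in means is 0.
    """
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # statistics.variance uses the n - 1 denominator (sample variance).
    v1 = statistics.variance(sample1) / n1
    v2 = statistics.variance(sample2) / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Approximate degrees of freedom for unequal variances.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical data, for illustration only.
group_a = [12.0, 15.0, 11.0, 14.0, 13.0]
group_b = [16.0, 18.0, 14.0, 17.0, 19.0, 15.0]
t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

The approximate df always falls between the smaller of $n_1 - 1$ and $n_2 - 1$ and the pooled value $n_1 + n_2 - 2$, which makes a quick sanity check on any hand calculation.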

Example: Comparing Checkout Page Designs

A marketing team tests if a new shopping cart checkout page reduces time spent compared to the old version.
15 users are randomly assigned to each version, and time spent is measured.

Version	Mean Time (s)
New	45
Old	48

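Null hypothesis: No difference in mean times ($H_0: \mu_{new} = \mu_{old}$).
Alternative hypothesis: New version is more efficient, i.e. takes less time ($H_1: \mu_{new} < \mu_{old}$, one-tailed test).
Significance level: $\alpha$, chosen before collecting data (0.05 is a common choice).
Calculate the t-statistic from the sample means, standard deviations, and sample sizes, then compare it to the critical t-value to make a decision.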

Example: Comparing Sales at Two Locations

Store	Mean Sales ($)	Std. Dev.	n
A	150	10	15
B	155	12	15

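Null hypothesis: Mean sales are equal ($H_0: \mu_A = \mu_B$).
Significance level: $\alpha$, chosen before collecting data (0.05 is a common choice).
Calculate the t-statistic and compare it to the critical t-value for df = 28 (for example, $t_{0.025,\,28} \approx 2.048$ for a two-tailed test at $\alpha = 0.05$). Note that df = 28 is the pooled value $n_1 + n_2 - 2$; the unequal-variance approximation gives a slightly smaller, non-integer value, which changes the critical value only marginally.
Conclusion: If $|t|$ is less than the critical value, do not reject the null hypothesis.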
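The store comparison can be worked through from the summary statistics in the table. The sketch below assumes a two-tailed test at $\alpha = 0.05$ (the notes do not preserve the significance level) and takes the critical value 2.048 for df = 28 from a standard t-table; the variable names are illustrative.

```python
import math

# Summary statistics from the table: mean sales, standard deviation, sample size.
mean_a, sd_a, n_a = 150.0, 10.0, 15
mean_b, sd_b, n_b = 155.0, 12.0, 15

var_term_a = sd_a ** 2 / n_a
var_term_b = sd_b ** 2 / n_b
standard_error = math.sqrt(var_term_a + var_term_b)
t_stat = (mean_a - mean_b) / standard_error

# Two candidate degrees of freedom: the unequal-variance approximation
# and the pooled value n_a + n_b - 2 quoted in the notes.
welch_df = (var_term_a + var_term_b) ** 2 / (
    var_term_a ** 2 / (n_a - 1) + var_term_b ** 2 / (n_b - 1)
)
pooled_df = n_a + n_b - 2

# Assumption: two-tailed test at alpha = 0.05; table value for df = 28.
T_CRIT = 2.048
reject_null = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.3f}, Welch df = {welch_df:.1f}, pooled df = {pooled_df}")
print("Reject H0" if reject_null else "Do not reject H0")
```

Under these assumptions the statistic is about -1.24, well inside the critical bounds, so the data do not show a difference in mean sales between the two stores.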

Confidence Interval for Difference of Means

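The confidence interval for $\mu_1 - \mu_2$ is:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Where $t_{\alpha/2}$ is the critical value from the t-distribution for the chosen confidence level.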
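As an illustration, the interval can be computed for the two-store sales data. This sketch assumes a 95% confidence level and passes in the table value 2.048 (df = 28) as the critical value; the helper name `mean_diff_ci` is hypothetical.

```python
import math


def mean_diff_ci(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """Confidence interval for mu1 - mu2 using the unpooled standard error."""
    diff = mean1 - mean2
    margin = t_crit * math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return diff - margin, diff + margin


# Store A vs. Store B; t_crit = 2.048 assumes 95% confidence and df = 28.
low, high = mean_diff_ci(150.0, 10.0, 15, 155.0, 12.0, 15, t_crit=2.048)
print(f"95% CI for mu_A - mu_B: ({low:.2f}, {high:.2f})")
```

The resulting interval runs from roughly -13.26 to 3.26. Because it contains 0, it agrees with a two-tailed test at the same level that does not reject equal means.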

Tests for Paired Samples (Dependent Samples)

Understanding Paired Samples

Paired samples (dependent samples) are matched or related in some way, such as before-and-after measurements on the same subjects.
Common in pre-post studies, or when controlling for extraneous factors.
Example: Measuring sales for the same customers before and after a marketing campaign.

Hypothesis Testing for Paired Differences

Null hypothesis: The mean paired difference is zero ($H_0: \mu_d = 0$).
Alternative hypothesis: The mean paired difference is not zero, greater than zero, or less than zero (two-tailed or one-tailed tests).

Calculating the Paired t-Test Statistic

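Compute the difference for each pair: $d_i = x_{1i} - x_{2i}$
Calculate the mean and standard deviation of the differences:

$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad s_d = \sqrt{\frac{\sum_{i=1}^{n}(d_i - \bar{d})^2}{n - 1}}$$

The test statistic is:

$$t = \frac{\bar{d} - \mu_0}{s_d / \sqrt{n}}$$

Where $\mu_0$ is the hypothesized mean difference (often 0) and $n$ is the number of pairs; the statistic has $n - 1$ degrees of freedom.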

Example: Comparing Time Spent on Movie Genres

User	Comedy (hrs)	Drama (hrs)	Difference (d)
1	2	1	1
2	4	1.5	2.5
3	2	4	-2

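Null hypothesis: No difference in mean time spent ($H_0: \mu_d = 0$, with $d$ = Comedy - Drama as in the table).
Alternative hypothesis: Time spent on drama is less than on comedy. With $d$ = Comedy - Drama this is $H_1: \mu_d > 0$, a right-tailed test; it is left-tailed only if the differences are defined as Drama - Comedy.
Calculate $\bar{d}$, $s_d$, and the t-statistic; compare to the critical t-value for $n - 1$ degrees of freedom.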
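The paired calculation can be sketched in Python using the three rows shown in the table (the table may be an excerpt of a larger data set, so the numbers are illustrative rather than a full analysis). Differences are taken as Comedy - Drama, matching the table, so "less time on drama" corresponds to a positive mean difference.

```python
import math
import statistics

# Hours per user, copied from the table rows shown above.
comedy = [2.0, 4.0, 2.0]
drama = [1.0, 1.5, 4.0]

# Step 1: difference for each pair (Comedy - Drama).
diffs = [c - d for c, d in zip(comedy, drama)]

# Step 2: mean and sample standard deviation of the differences.
n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)  # n - 1 denominator

# Step 3: test statistic against a hypothesized mean difference of 0.
hypothesized = 0.0
t_stat = (d_bar - hypothesized) / (s_d / math.sqrt(n))
df = n - 1

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t_stat:.3f}, df = {df}")
```

With only three pairs the statistic is about 0.38 on 2 degrees of freedom, far below any conventional critical value, so these rows alone give no evidence of a difference.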

Confidence Interval for Mean Paired Difference

The confidence interval is:

$$\bar{d} \pm t_{\alpha/2,\,n-1}\,\frac{s_d}{\sqrt{n}}$$

Where $t_{\alpha/2,\,n-1}$ is the critical value for $n - 1$ degrees of freedom.
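A short sketch of the paired interval, reusing the three differences from the movie-genre table. It assumes a 95% confidence level and takes the critical value 4.303 (df = 2) from a standard t-table; the function name `paired_ci` is hypothetical.

```python
import math
import statistics


def paired_ci(diffs, t_crit):
    """Confidence interval for the mean paired difference."""
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    margin = t_crit * statistics.stdev(diffs) / math.sqrt(n)
    return d_bar - margin, d_bar + margin


# Differences (Comedy - Drama) from the table; t_crit assumes 95% and df = 2.
low, high = paired_ci([1.0, 2.5, -2.0], t_crit=4.303)
print(f"95% CI for mean difference: ({low:.2f}, {high:.2f})")
```

The interval is very wide (roughly -5.19 to 6.19 hours) because the critical value is large when there are only two degrees of freedom, which shows how little three pairs can tell us.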

Tests for Two Population Proportions

Introduction to Proportion Tests

Used to compare the proportion of successes in two populations.
Example: Comparing purchase completion rates between two customer groups.

Test Statistic for Difference in Proportions

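Let $\hat{p}_1 = x_1 / n_1$ and $\hat{p}_2 = x_2 / n_2$ be the sample proportions from populations 1 and 2, with sample sizes $n_1$ and $n_2$ and success counts $x_1$ and $x_2$.
The pooled proportion is:

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$

The test statistic is:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Assumptions: Each group should have at least 5 expected successes and failures ($n_1\hat{p} \ge 5$, $n_1(1 - \hat{p}) \ge 5$, etc.).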

Example: Comparing Product Failure Rates

Design	Failures	Sample Size	Proportion
Old	60	250	0.24
New	50	250	0.20

Null hypothesis: No difference in failure rates ($H_0: p_{old} = p_{new}$).
Alternative hypothesis: New design has a lower failure rate ($H_1: p_{new} < p_{old}$, one-tailed test).
Calculate the pooled proportion and z-statistic, and compare to the critical z-value (e.g., 1.645 in absolute value for a one-tailed test at $\alpha = 0.05$).
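The failure-rate comparison can be sketched with the Python standard library. The significance level of 0.05 is an assumption (the notes do not fix one), the difference is taken as old minus new so that the alternative is right-tailed, and the function name `two_proportion_z` is illustrative.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (z, pooled proportion) for the pooled two-proportion z-test."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se, pooled


# Old design: 60 failures of 250; new design: 50 failures of 250.
z, pooled = two_proportion_z(60, 250, 50, 250)

# Large-sample check: at least 5 expected failures and non-failures per group.
expected_ok = all(n * p >= 5 for n in (250, 250) for p in (pooled, 1 - pooled))

# Right-tailed test of H1: p_old > p_new, assuming alpha = 0.05.
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)
reject_null = z > z_crit

print(f"pooled = {pooled:.2f}, z = {z:.3f}, z_crit = {z_crit:.3f}, p = {p_value:.3f}")
print("Reject H0" if reject_null else "Do not reject H0")
```

Here the pooled proportion is 0.22 and z is about 1.08, below 1.645, so at the assumed level the sample does not show that the new design fails less often (one-sided p-value about 0.14).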

Confidence Interval for Difference in Proportions

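The confidence interval for $p_1 - p_2$ is:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

Where $z_{\alpha/2}$ is the critical value from the standard normal distribution for the chosen confidence level. Unlike the test statistic, the interval uses the unpooled standard error, because it does not assume $p_1 = p_2$.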
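A minimal sketch of this interval for the product-failure data (old design 60 of 250, new design 50 of 250), assuming a 95% confidence level. The function name `proportion_diff_ci` and its `confidence` parameter are hypothetical.

```python
import math
from statistics import NormalDist


def proportion_diff_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value from the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se


# p_old - p_new for the failure-rate example.
low, high = proportion_diff_ci(60, 250, 50, 250)
print(f"95% CI for p_old - p_new: ({low:.4f}, {high:.4f})")
```

The interval is roughly -0.033 to 0.113. It includes 0, so a difference of zero between the two failure rates remains plausible at this confidence level.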

Summary Table: Choosing the Right Test

Scenario	Test	Key Assumptions
Two independent means, unknown variances	Two-sample t-test (Welch's)	Samples independent, normality, variances not assumed equal
Paired/dependent samples	Paired t-test	Differences are normally distributed
Two proportions	z-test for proportions	Large samples, at least 5 expected successes/failures per group

Additional info: In practice, always check assumptions (normality, independence, equal variances if required) before applying these tests. Use statistical software for calculations when sample sizes are large or formulas are complex.