
MA 162 Modules 4-6 Study Guide: Hypothesis Testing, Proportions, Chi-Square, Regression, and ANOVA

Hypothesis Testing for One Sample

The Basics of Hypothesis Testing

Hypothesis testing is a fundamental statistical method used to make inferences about population parameters based on sample data. It involves formulating two competing hypotheses and using sample evidence to decide which is more plausible.

  • Null Hypothesis (H0): The statement being tested, usually representing no effect or no difference.

  • Alternative Hypothesis (Ha): The statement we are trying to find evidence for, representing an effect or difference.

  • Logic of Hypothesis Testing: We collect sample data and determine whether it is consistent with H0 or provides enough evidence to reject it in favor of Ha.

  • Type I Error: Rejecting H0 when it is true (false positive). Its probability is denoted by $\alpha$, the significance level.

  • Type II Error: Failing to reject H0 when Ha is true (false negative). Its probability is denoted by $\beta$.

  • Conclusions: "Reject H0" or "Fail to reject H0" based on the evidence.
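The link between the significance level and the Type I error can be checked with a small simulation. The sketch below is illustrative only: it draws samples from a population where H0 is actually true and counts how often a two-tailed z-test rejects it. The helper names, sample size, number of trials, and seed are assumptions made for this example, not part of the study guide.

```python
import random
from statistics import NormalDist, mean

def z_test_rejects(sample, mu0, sigma, alpha=0.05):
    """Two-tailed one-sample z-test; returns True for 'Reject H0'."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha

def type_i_error_rate(trials=2000, n=25, alpha=0.05, seed=0):
    """Fraction of trials that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_rejects(sample, mu0=0.0, sigma=1.0, alpha=alpha):
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen significance level, since every rejection in this setup is a false positive.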

Hypothesis Testing for One Population Mean

Depending on whether the population standard deviation is known or unknown, different tests are used to assess the mean.

  • Critical-Value Approach: Compare the test statistic to a critical value from the appropriate distribution.

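  • P-Value Approach: Calculate the probability, assuming H0 is true, of observing a test statistic at least as extreme as the one obtained. Reject H0 when this p-value is at most the significance level.

  • Confidence Interval Approach: For a two-tailed test, if the confidence interval contains the null value (for example, the hypothesized mean, or 0 for a difference), do not reject H0.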

  • Z-Test: Used when the population standard deviation is known.

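    • Formula: $z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$

  • T-Test: Used when the population standard deviation is unknown and estimated by the sample standard deviation $s$.

    • Formula: $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$, with $n - 1$ degrees of freedom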

Example: Testing whether the average height of students differs from 65 inches using a sample of 30 students.
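A minimal sketch of this example using the critical-value approach. The 30 height values are made up for illustration, the function name `one_sample_t` is hypothetical, and the critical value 2.045 is the usual t-table entry for 29 degrees of freedom at a two-tailed significance level of 0.05.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t, df) for H0: mu = mu0 when sigma is unknown."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1

# Hypothetical heights (inches) for 30 students.
heights = [
    64.2, 66.1, 65.5, 67.0, 63.8, 66.4, 65.9, 68.2, 64.7, 66.8,
    65.1, 67.5, 66.0, 64.9, 66.3, 65.7, 67.1, 66.6, 64.4, 65.8,
    66.9, 67.3, 65.2, 66.2, 64.8, 66.7, 65.4, 67.8, 66.5, 65.6,
]

t_stat, df = one_sample_t(heights, mu0=65)
T_CRIT = 2.045  # t-table value: df = 29, two-tailed alpha = 0.05
decision = "Reject H0" if abs(t_stat) > T_CRIT else "Fail to reject H0"
print(f"t = {t_stat:.3f}, df = {df}: {decision}")
```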

Hypothesis Testing for Two Samples

Comparing Two Population Means

When comparing means from two populations, the method depends on whether samples are independent or paired, and whether population standard deviations are equal.

  • Pooled T-Test: Used for independent samples with equal population standard deviations.

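    • Formula: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, with $n_1 + n_2 - 2$ degrees of freedom

    • Pooled standard deviation: $s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$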

  • Non-Pooled T-Test: Used for independent samples with unequal population standard deviations.

    • Formula: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

  • Paired T-Test: Used for paired samples (e.g., before-and-after measurements).

    • Calculate differences, then perform a one-sample T-Test on the differences.

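    • Formula: $t = \dfrac{\bar{d}}{s_d / \sqrt{n}}$, with $n - 1$ degrees of freedom, where $\bar{d}$ and $s_d$ are the mean and standard deviation of the $n$ paired differences (testing a mean difference of 0)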

Example: Comparing test scores between two teaching methods using independent samples.
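The three two-sample statistics can be sketched as follows. The function names and the score data are hypothetical. The degrees of freedom returned by `non_pooled_t` use the Welch-Satterthwaite approximation, a common choice that the guide itself does not spell out.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Pooled t statistic and df (assumes equal population SDs)."""
    n1, n2 = len(a), len(b)
    sp = sqrt(((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2))
    t = (mean(a) - mean(b)) / (sp * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def non_pooled_t(a, b):
    """Non-pooled t statistic with Welch-Satterthwaite df (an assumption)."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def paired_t(before, after):
    """One-sample t statistic on the paired differences (after - before)."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (sqrt(variance(diffs)) / sqrt(n))
    return t, n - 1

# Hypothetical test scores under two teaching methods (independent samples).
method_a = [78, 85, 82, 90, 74, 88]
method_b = [72, 80, 79, 84, 70, 77]

print("pooled:     t = %.3f, df = %d" % pooled_t(method_a, method_b))
print("non-pooled: t = %.3f, df = %.1f" % non_pooled_t(method_a, method_b))
```

When the two sample variances and sizes are equal, the pooled and non-pooled statistics coincide; they diverge as the variances or sample sizes become more unequal.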

Inferences for Population Proportions

Confidence Intervals and Hypothesis Testing for Proportions

Statistical inference for proportions involves estimating the proportion of a population with a certain characteristic and testing hypotheses about it.

  • Large-Sample Confidence Interval: Used when sample size is large enough for normal approximation.

    • Formula: $\hat{p} \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$

  • Margin of Error: The maximum expected difference between the true population proportion and the sample estimate.

    • Formula: $E = z_{\alpha/2} \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$

  • Relationship: Increasing sample size or decreasing confidence level reduces margin of error.

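  • Hypothesis Test for Proportion: Uses a z test statistic built from the hypothesized proportion $p_0$.

    • Formula: $z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$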

Example: Estimating the proportion of students who prefer online classes and testing if it differs from 50%.
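A short sketch of this example, assuming a hypothetical survey in which 230 of 400 students prefer online classes. The function names are illustrative; the normal quantiles and tail probabilities come from `statistics.NormalDist` in the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval: returns (low, high, margin)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin

def proportion_z_test(successes, n, p0):
    """Two-tailed z-test of H0: p = p0; returns (z, p_value)."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical survey: 230 of 400 students prefer online classes.
low, high, margin = proportion_ci(230, 400)
z, p_value = proportion_z_test(230, 400, p0=0.5)

print(f"95% CI: ({low:.3f}, {high:.3f}), margin of error = {margin:.3f}")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Note that the confidence interval uses the sample proportion in its standard error, while the test uses the hypothesized proportion, matching the two formulas above.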

Chi-Square Tests

Goodness of Fit and Independence Tests

Chi-square tests are used to assess whether observed data fit an expected distribution or whether two categorical variables are associated.

  • Goodness of Fit Test: Determines if a sample distribution matches a hypothesized distribution.

    • Formula: $\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i}$

    • $O_i$: Observed frequency; $E_i$: Expected frequency

  • Independence Test: Assesses whether two categorical variables are independent.

    • Use contingency tables to compare observed and expected frequencies.

  • Important Note: The test statistic must be computed from frequencies (counts), not percentages.

Example: Testing if color preference is independent of gender in a sample of 1500 people.
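A sketch of the independence test for an example like this one. The 2 x 3 table of counts below is invented (it sums to 1500), and the function name is hypothetical. Expected counts are computed as (row total x column total) / grand total, and 5.991 is the standard chi-square table value for 2 degrees of freedom at a significance level of 0.05.

```python
def chi_square_independence(observed):
    """Return (chi2, df, expected) for a contingency table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Hypothetical counts: rows = gender, columns = preferred color.
observed = [
    [250, 200, 250],
    [350, 250, 200],
]

chi2, df, expected = chi_square_independence(observed)
CHI2_CRIT = 5.991  # chi-square table value: df = 2, alpha = 0.05
decision = "Reject H0" if chi2 > CHI2_CRIT else "Fail to reject H0"
print(f"chi-square = {chi2:.3f}, df = {df}: {decision}")
```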

Regression and Correlation

Simple Linear Regression and Correlation

Regression and correlation are used to analyze relationships between two quantitative variables.

  • Regression Equation: $\hat{y} = b_0 + b_1 x$, where $y$ is the response variable and $x$ is the predictor variable.

  • Interpretation: The slope $b_1$ indicates the predicted change in $y$ for a one-unit increase in $x$.

  • Predictor vs. Response Variable: Predictor (independent variable), Response (dependent variable).

  • Linear Correlation Coefficient (r): Measures strength and direction of linear relationship.

    • Range: $-1 \le r \le 1$

  • Coefficient of Determination ($r^2$): Proportion of variance in $y$ explained by $x$.

  • Relationship: $r^2$ is the square of $r$.

Example: Predicting final exam scores based on hours studied.
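A minimal least-squares sketch for this example. The hours and scores are made-up values and `fit_line` is a hypothetical helper; the slope is computed as $S_{xy}/S_{xx}$ and the correlation as $S_{xy}/\sqrt{S_{xx} S_{yy}}$, the standard textbook formulas.

```python
from math import sqrt
from statistics import mean

def fit_line(x, y):
    """Return (b0, b1, r) for the least-squares line y-hat = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx              # slope
    b0 = y_bar - b1 * x_bar     # intercept
    r = sxy / sqrt(sxx * syy)   # linear correlation coefficient
    return b0, b1, r

# Hypothetical data: hours studied and final exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 63, 71, 79]

b0, b1, r = fit_line(hours, scores)
r2 = r ** 2
predicted = b0 + b1 * 3.5  # predicted score for 3.5 hours of study

print(f"y-hat = {b0:.2f} + {b1:.2f} x, r = {r:.4f}, r^2 = {r2:.4f}")
print(f"Predicted score for 3.5 hours: {predicted:.2f}")
```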

Inferential Methods in Regression and Correlation

Linear Regression T-Test and Prediction

Inferential methods in regression assess whether the predictor variable is useful for predicting the response variable.

  • Linear Regression T-Test: Tests if the slope of the regression line is significantly different from zero.

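    • Null hypothesis: $H_0: \beta_1 = 0$

    • Alternative hypothesis: $H_a: \beta_1 \ne 0$

    • Test statistic: $t = \dfrac{b_1}{s_{b_1}}$, with $n - 2$ degrees of freedom, where $s_{b_1}$ is the standard error of the sample slope $b_1$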

  • Prediction: Use the regression equation to predict $y$ for given $x$ values.

  • Interpretation: If the test rejects H0 (the slope differs significantly from zero), $x$ is useful for predicting $y$.

Example: Testing if hours studied significantly predict exam scores.
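A sketch of the slope test for this example, using invented hours and scores. The helper name is hypothetical, and the standard error is computed as $s_e / \sqrt{S_{xx}}$ with $s_e = \sqrt{SSE/(n-2)}$, the usual textbook form, which is assumed here rather than taken from the guide. The critical value 3.182 is the t-table entry for 3 degrees of freedom at a two-tailed significance level of 0.05.

```python
from math import sqrt
from statistics import mean

def slope_t_test(x, y):
    """Return (t, df) for H0: beta_1 = 0 in simple linear regression."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Sum of squared residuals around the fitted line.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = sqrt(sse / (n - 2))    # standard error of the estimate
    se_b1 = s_e / sqrt(sxx)      # standard error of the slope
    return b1 / se_b1, n - 2

# Hypothetical data: hours studied and exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 63, 71, 79]

t_stat, df = slope_t_test(hours, scores)
T_CRIT = 3.182  # t-table value: df = 3, two-tailed alpha = 0.05
decision = "Reject H0" if abs(t_stat) > T_CRIT else "Fail to reject H0"
print(f"t = {t_stat:.2f}, df = {df}: {decision}")
```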

Summary Table: Types of Hypothesis Tests

The following table summarizes the main types of hypothesis tests covered:

| Test Type | Population Parameter | Sample Type | Distribution Used |
|---|---|---|---|
| Z-Test | Mean ($\mu$) | One sample, $\sigma$ known | Normal (Z) |
| T-Test | Mean ($\mu$) | One sample, $\sigma$ unknown | Student's t |
| Pooled T-Test | Difference of means ($\mu_1 - \mu_2$) | Two independent samples, equal variances | Student's t |
| Non-Pooled T-Test | Difference of means ($\mu_1 - \mu_2$) | Two independent samples, unequal variances | Student's t |
| Paired T-Test | Mean of differences | Paired samples | Student's t |
| Z-Test for Proportion | Proportion ($p$) | One sample, large n | Normal (Z) |
| Chi-Square Test | Distribution/Association | Categorical data | Chi-Square ($\chi^2$) |
| Linear Regression T-Test | Slope ($\beta_1$) | Regression data | Student's t |

Additional info: This summary table is inferred from the study guide's listed tests and their typical properties in statistics courses.
