Exam 2 Review Guide: Statistics Concepts (Chapters 6–8, 10–12)

Scatterplots, Association, and Correlation

Constructing and Interpreting Scatterplots

Scatterplots are graphical representations used to visualize the relationship between two quantitative variables.

  • Explanatory Variable: The variable that is presumed to explain or influence changes in the other variable (often plotted on the x-axis).

  • Response Variable: The variable that is measured as the outcome (often plotted on the y-axis).

  • Association: Describes the direction (positive/negative), form (linear/nonlinear), and strength of the relationship between variables.

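  • Correlation Coefficient (r): Measures the strength and direction of a linear relationship between two quantitative variables. Values range from -1 to 1; values near -1 or 1 indicate a strong linear association, and values near 0 indicate a weak one.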

Example: A scatterplot of weight (response, y-axis) against height (explanatory, x-axis) for a group of individuals.

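Formula: $r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)$, where $\bar{x}$, $\bar{y}$ are the sample means and $s_x$, $s_y$ are the sample standard deviations.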
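As a rough illustration of the correlation coefficient, the sketch below computes r as the sum of z-score products divided by n - 1. The function name `correlation` and the height/weight values are made up for this example; they do not come from the course materials.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r: sum of z-score products divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Multiply the z-scores pairwise, add them up, divide by n - 1.
    z_products = (((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                  for x, y in zip(xs, ys))
    return sum(z_products) / (n - 1)

# Hypothetical heights (cm) and weights (kg), invented for illustration.
heights = [150, 160, 165, 172, 180, 188]
weights = [52, 58, 63, 70, 77, 85]
r = correlation(heights, weights)
print(f"r = {r:.3f}")  # close to 1: strong positive linear association
```

Because r is built from z-scores, it has no units and does not change if height is recorded in inches instead of centimeters.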

Linear Regression

Conditions and Assumptions for Linear Models

Linear regression models the relationship between two quantitative variables using a straight line.

  • Linearity: The relationship between variables should be linear.

  • Independence: Observations should be independent.

  • Equal Variance: The spread of residuals should be roughly constant.

  • Normality: Residuals should be approximately normally distributed.

Finding the Linear Regression Model

  • Regression Equation: $\hat{y} = b_0 + b_1 x$

  • Calculating Slope ($b_1$) and Intercept ($b_0$): $b_1 = r \frac{s_y}{s_x}$ and $b_0 = \bar{y} - b_1 \bar{x}$

  • Use a calculator or formulas to find $b_0$ and $b_1$.

  • Interpretation: Slope ($b_1$) indicates the predicted change in $y$ for each one-unit increase in $x$. Intercept ($b_0$) is the predicted value of $y$ when $x = 0$.

  • Predicted Values: Use the regression equation to estimate $\hat{y}$ for given $x$ values.

  • Coefficient of Determination ($R^2$): Represents the proportion of variance in the response variable explained by the model.

  • Residuals: The difference between observed and predicted values: $e = y - \hat{y}$

  • Residual Plot: Used to assess the fit of the model and check assumptions.

Example: Predicting exam scores based on hours studied.
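The regression quantities listed above can be sketched in a few lines of Python. This is a minimal least-squares fit, assuming a small invented data set of hours studied and exam scores; the helper name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y-hat = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx              # algebraically the same as r * s_y / s_x
    b0 = mean_y - b1 * mean_x   # the line passes through (x-bar, y-bar)
    return b0, b1

# Hypothetical data: hours studied and exam scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 65, 74, 79]

b0, b1 = fit_line(hours, scores)
predicted = [b0 + b1 * x for x in hours]
# Residual = observed - predicted.
residuals = [y - y_hat for y, y_hat in zip(scores, predicted)]

# R^2 = 1 - (unexplained variation) / (total variation).
mean_score = sum(scores) / len(scores)
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((y - mean_score) ** 2 for y in scores)
r_squared = 1 - ss_res / ss_tot

print(f"y-hat = {b0:.1f} + {b1:.1f} * hours")
print(f"R^2 = {r_squared:.3f}")
print("residuals:", [round(e, 1) for e in residuals])
```

For these made-up numbers the slope is 6.8, so the model predicts about 6.8 more points per additional hour studied. Plotting `residuals` against `hours` gives the residual plot; a patternless scatter around zero supports the linear model.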

Regression Wisdom

Outliers, Leverage, and Influence

Understanding the impact of individual data points on regression models is crucial.

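  • Outlier: A data point that lies far from the overall pattern; in regression, typically a point with a large residual.

  • Leverage: A data point has high leverage when its value for the explanatory variable is far from the mean of the x-values.

  • Influential Point: A data point whose removal would noticeably change the regression line.

  • Large Residuals and High Leverage: Points with either feature can distort the regression model and make it less reliable, so they should be investigated.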

  • Extrapolation: Making predictions outside the range of observed data; often unreliable.

Example: A single high-value outlier can change the slope of the regression line.
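To make the effect of an influential point concrete, the sketch below fits a least-squares slope with and without one extreme point. The data and the helper name `slope` are invented for illustration.

```python
def slope(xs, ys):
    """Least-squares slope b1 = Sxy / Sxx."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical data lying exactly on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

# One extra point with an extreme x-value (high leverage) that does
# not follow the pattern of the other points.
xs_with = xs + [20]
ys_with = ys + [10]

slope_without = slope(xs, ys)
slope_with = slope(xs_with, ys_with)

print(f"slope without the extreme point: {slope_without:.2f}")
print(f"slope with the extreme point:    {slope_with:.2f}")
```

A single point is enough to pull the slope from 2 down to roughly 0.3 here, which is why it is worth refitting the model with and without a suspicious point and comparing the results.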

Sample Surveys

Populations, Samples, and Sampling Methods

Surveys are used to collect data from a subset of a population to make inferences about the whole.

  • Population: The entire group of interest.

  • Sample: A subset of the population.

  • Population Parameter: A numerical summary of the population.

  • Statistic: A numerical summary of the sample.

  • Pilot Survey: A small-scale survey used to test procedures and questions.

Types of Sampling

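  • Simple Random Sample: Every possible sample of the chosen size has an equal chance of being selected, so every member also has an equal chance.

  • Stratified Sampling: The population is divided into homogeneous groups (strata), and a random sample is taken from each stratum.

  • Cluster Sampling: The population is divided into clusters, some clusters are chosen at random, and every member of each chosen cluster is sampled.

  • Systematic Sampling: Every nth member is selected, starting from a randomly chosen member.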

  • Convenience Sampling: Sample is taken from easily accessible members.

  • Census: Data collected from every member of the population.
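The four random sampling methods above can be sketched with Python's standard `random` module. The population of 100 students, the field names `year` and `section`, and the function names are all hypothetical choices made for this example.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100 students, 25 per class year,
# spread over 10 sections of 10 students each.
population = [{"id": i, "year": 1 + i // 25, "section": i % 10}
              for i in range(100)]

def group_by(pop, key):
    """Split the population into groups that share the same value of key."""
    groups = {}
    for member in pop:
        groups.setdefault(member[key], []).append(member)
    return groups

def simple_random_sample(pop, n):
    # Every set of n members is equally likely.
    return rng.sample(pop, n)

def stratified_sample(pop, key, n_per_stratum):
    # Random sample drawn separately from each stratum.
    sample = []
    for stratum in group_by(pop, key).values():
        sample.extend(rng.sample(stratum, n_per_stratum))
    return sample

def cluster_sample(pop, key, n_clusters):
    # Choose whole clusters at random, then keep everyone in them.
    clusters = group_by(pop, key)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for c in chosen for member in clusters[c]]

def systematic_sample(pop, step):
    # Random starting point, then every step-th member.
    start = rng.randrange(step)
    return pop[start::step]

srs = simple_random_sample(population, 10)
strat = stratified_sample(population, "year", 3)
clus = cluster_sample(population, "section", 2)
syst = systematic_sample(population, 10)
print(len(srs), len(strat), len(clus), len(syst))
```

Note the difference in what gets randomized: stratified sampling draws individuals within every group, while cluster sampling draws whole groups.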

Types of Bias

  • Undercoverage Bias: Some groups are not represented.

  • Response Bias: Survey responses are influenced by wording or interviewer.

  • Nonresponse Bias: Selected individuals do not respond, and those who do not respond may differ from those who do.

  • Voluntary Response Bias: Individuals choose to participate, often those with strong opinions.

Example: Using only online surveys may lead to undercoverage bias if some people lack internet access.

Experiments and Observational Studies

Types of Studies

Research can be conducted through observational studies or experiments.

  • Observational Study: Researchers observe subjects without intervention.

  • Experimental Study: Researchers assign treatments to subjects and observe outcomes.

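  • Retrospective Study: An observational study that looks backward in time, using existing records or subjects' recall of past events.

  • Prospective Study: An observational study that identifies subjects in advance and follows them forward in time, recording data as events occur.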

Elements of an Experiment

  • Random Assignment: Subjects are randomly assigned to treatments.

  • Factor: The explanatory variable manipulated by the experimenter.

  • Levels: Different values of the factor.

  • Treatments: Combinations of factor levels.

  • Response Variable: The outcome measured.

  • Blinding: Subjects or experimenters do not know which treatment is assigned.

  • Experimental Unit/Subject: The individual receiving the treatment.

Example: Testing a new drug with random assignment and blinding.
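Random assignment can be carried out by shuffling the subjects and dealing them into treatment groups. The sketch below assumes 20 hypothetical subjects and two treatments; the function name `randomly_assign` and the labels are illustrative only.

```python
import random

def randomly_assign(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the caller's list is unchanged
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical subjects and treatment labels.
subjects = [f"subject_{i:02d}" for i in range(1, 21)]
groups = randomly_assign(subjects, ["new drug", "placebo"], seed=42)

for treatment, members in groups.items():
    print(treatment, len(members))
```

Because chance alone decides who receives which treatment, lurking variables tend to balance out across the groups. In a blinded study the group labels would be replaced by codes so that subjects, evaluators, or both cannot tell which treatment was given.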

From Randomness to Probability

Probability Concepts and Rules

Probability quantifies the likelihood of events in random phenomena.

  • Random Phenomenon: A situation in which we know which outcomes are possible but cannot predict with certainty which one will occur.

  • Probability: A number between 0 and 1 representing the chance of an event.

  • Trial: A single occurrence of a random phenomenon.

  • Outcome: The result of a trial.

  • Event: A collection of outcomes.

  • Independent Events: Knowing that one event occurred does not change the probability of the other.

  • Disjoint Events: Events that cannot occur together.

Properties and Rules of Probability

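  • Complement Rule: $P(A^c) = 1 - P(A)$

  • Addition Rule for Disjoint Events: $P(A \text{ or } B) = P(A) + P(B)$

  • General Addition Rule: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$

  • Multiplication Rule for Independent Events: $P(A \text{ and } B) = P(A) \times P(B)$

  • General Multiplication Rule: $P(A \text{ and } B) = P(A) \times P(B \mid A)$

  • Conditional Probability: $P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$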
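One way to check the probability rules is to list every equally likely outcome and count. The sketch below does this for two fair dice; the events A, B, and C are chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event = favorable outcomes / total outcomes."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def A(o): return o[0] == 6          # first die shows 6
def B(o): return o[1] == 6          # second die shows 6
def C(o): return o[0] + o[1] >= 10  # sum is at least 10

p_a, p_b, p_c = prob(A), prob(B), prob(C)
p_not_a = prob(lambda o: not A(o))
p_a_and_b = prob(lambda o: A(o) and B(o))
p_a_or_b = prob(lambda o: A(o) or B(o))
p_a_and_c = prob(lambda o: A(o) and C(o))
p_c_given_a = p_a_and_c / p_a       # conditional probability P(C | A)

print("Complement:       ", p_not_a == 1 - p_a)
print("General addition: ", p_a_or_b == p_a + p_b - p_a_and_b)
print("Independent A, B: ", p_a_and_b == p_a * p_b)
print("General multiply: ", p_a_and_c == p_a * p_c_given_a)
print("P(C | A) =", p_c_given_a, "but P(C) =", p_c)
```

A and B are independent because the dice do not affect each other. A and C are not: knowing the first die is a 6 raises the probability that the sum is at least 10 from 1/6 to 1/2.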

Using Tables and Diagrams to Find Probabilities

  • Contingency Table: Used to organize data and calculate probabilities.

  • Tree Diagram: Visualizes sequences of events and their probabilities.

  • Venn Diagram: Illustrates relationships between events.

Example: Calculating the probability of drawing a red card or a face card from a deck using a Venn diagram.
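The card example can be verified by building a standard 52-card deck and counting, which mirrors the overlapping circles of a Venn diagram. The variable names below are illustrative.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

# The two circles of the Venn diagram.
red = {card for card in deck if card[1] in ("hearts", "diamonds")}
face = {card for card in deck if card[0] in ("J", "Q", "K")}

p_red = Fraction(len(red), len(deck))
p_face = Fraction(len(face), len(deck))
p_both = Fraction(len(red & face), len(deck))     # overlap of the circles
p_either = Fraction(len(red | face), len(deck))   # union of the circles

print("P(red) =", p_red)
print("P(face) =", p_face)
print("P(red and face) =", p_both)
print("P(red or face) =", p_either)
```

The events are not disjoint (six cards are both red and face cards), so the general addition rule applies: 26/52 + 12/52 - 6/52 = 32/52 = 8/13.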

Sampling Method        | Description
-----------------------|----------------------------------------------------------
Simple Random Sample   | Every member has equal chance of selection
Stratified Sampling    | Population divided into strata, random samples from each
Cluster Sampling       | Population divided into clusters, entire clusters sampled
Systematic Sampling    | Every nth member selected
Convenience Sampling   | Sample taken from easily accessible members
Census                 | Data from every member of population

Type of Bias             | Description
-------------------------|--------------------------------------------------------------
Undercoverage Bias       | Some groups not represented
Response Bias            | Survey responses influenced by wording/interviewer
Nonresponse Bias         | Selected individuals do not respond
Voluntary Response Bias  | Individuals choose to participate, often with strong opinions

Additional info: Academic context and definitions have been expanded for clarity and completeness. Formulas are provided in LaTeX format for exam preparation.
