Foundations of Statistics: Study Guide and Key Terms

Overview of Statistical Concepts

This study guide outlines the foundational topics and key terms essential for a college-level statistics course. The material is organized by major themes, including types of studies, data classification, descriptive statistics, and linear regression. Each section provides definitions, examples, and important formulas to support exam preparation.

Types of Statistical Studies

Descriptive vs. Inferential Statistics

Statistics can be broadly classified into two categories: descriptive statistics and inferential statistics.

  • Descriptive Statistics: Methods for summarizing and organizing data using numbers, tables, and graphs.

  • Inferential Statistics: Techniques for making generalizations or predictions about a population based on a sample.

Example: Calculating the average test score of a class (descriptive) vs. using that average to estimate the average score of all students in the school (inferential).

Observational Studies vs. Designed Experiments

  • Observational Study: Researchers observe subjects without manipulating variables.

  • Designed Experiment: Researchers actively impose treatments and control variables to study effects.

Example: Surveying people about their eating habits (observational) vs. assigning diets to groups and measuring outcomes (experiment).

Sampling Methods

  • Simple Random Sample: Every possible sample of a given size has the same chance of being chosen, so every member of the population has an equal chance of being selected.

  • Systematic Sampling: Selecting every k-th member from a list.

  • Cluster Sampling: Dividing the population into groups (clusters) and randomly selecting entire clusters.

  • Stratified Sampling: Dividing the population into strata and sampling from each stratum.

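Formula for Simple Random Sampling: When a simple random sample of size $n$ is drawn without replacement from a population of size $N$, each member's probability of being selected is $P = \frac{n}{N}$.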
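The four sampling methods above can be sketched in Python. This is a minimal illustration, not a survey-grade implementation: the population of 20 ID numbers, the grouping into `groups`, and all function names are hypothetical and chosen only for this example.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n members without replacement; every size-n subset is equally likely."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick a random start among the first k positions, then take every k-th member."""
    start = rng.randrange(k)
    return population[start::k]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of the chosen ones."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of fixed size from every stratum."""
    return [member
            for name in sorted(strata)
            for member in rng.sample(strata[name], n_per_stratum)]

rng = random.Random(0)            # fixed seed so the sketch is repeatable
population = list(range(1, 21))   # hypothetical population: ID numbers 1..20
groups = {                        # hypothetical grouping, reused as clusters and as strata
    "A": [1, 2, 3, 4, 5],
    "B": [6, 7, 8, 9, 10],
    "C": [11, 12, 13, 14, 15],
    "D": [16, 17, 18, 19, 20],
}

srs = simple_random_sample(population, 5, rng)
systematic = systematic_sample(population, 4, rng)
cluster = cluster_sample(groups, 2, rng)
stratified = stratified_sample(groups, 2, rng)
```

The same dictionary serves as clusters and as strata to highlight the difference: cluster sampling keeps all members of a few groups, while stratified sampling keeps a few members of every group.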

Data Classification and Organization

Types of Data

  • Qualitative (Categorical) Data: Non-numeric data that describes categories or groups.

  • Quantitative Data: Numeric data representing counts or measurements.

  • Discrete Data: Quantitative data that take countable values (e.g., number of students).

  • Continuous Data: Quantitative data that can take any value within a range (e.g., height).

Frequency Distributions and Graphs

  • Frequency Distribution: A table showing the number of occurrences for each category or interval.

  • Relative Frequency: The proportion of observations in each category.

  • Bar Chart: Used for qualitative data.

  • Histogram: Used for quantitative data, showing frequency of data within intervals.

  • Dotplot, Stem-and-Leaf Plot: Graphs that show the distribution of a small quantitative data set while keeping the individual values visible.

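Formula for Relative Frequency: $\text{relative frequency} = \frac{f}{n}$, where $f$ is the frequency of the category and $n$ is the total number of observations.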
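A frequency distribution and its relative frequencies can be computed in a few lines. The sketch below uses a made-up list of colors; the data and variable names are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical qualitative data: favorite colors of six respondents
colors = ["red", "blue", "red", "green", "blue", "red"]

# Frequency distribution: number of occurrences of each category
frequency = Counter(colors)

# Relative frequency: each count divided by the total number of observations
total = sum(frequency.values())
relative_frequency = {category: count / total
                      for category, count in frequency.items()}

for category, share in relative_frequency.items():
    print(f"{category}: {frequency[category]} observations, {share:.3f} of the total")
```

The relative frequencies always sum to 1, which is a quick check on the arithmetic.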

Descriptive Statistics

Measures of Central Tendency

  • Mean (Average): $\bar{x} = \frac{\sum x_i}{n}$, the sum of all values divided by the number of values.

  • Median: The middle value when data are ordered.

  • Mode: The value that appears most frequently.
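The three measures of center can be checked with Python's standard `statistics` module. The test scores below are invented for illustration.

```python
import statistics

# Hypothetical test scores
scores = [70, 85, 85, 90, 100]

mean = statistics.mean(scores)        # (70 + 85 + 85 + 90 + 100) / 5 = 86
median = statistics.median(scores)    # middle value of the ordered data = 85
modes = statistics.multimode(scores)  # list of the most frequent value(s) = [85]

print(mean, median, modes)
```

`multimode` returns a list because a data set can have more than one mode.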

Measures of Spread

  • Range: Difference between the highest and lowest values.

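  • Variance: For a sample, $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$; for a population, $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$.

  • Standard Deviation: The square root of the variance: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$ for a sample, $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$ for a population.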

  • Interquartile Range (IQR): $\text{IQR} = Q_3 - Q_1$, the spread of the middle 50% of the data.
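As a rough illustration, the range, variance, and standard deviation can be computed with the standard `statistics` module. The data set is invented, and the sketch shows both the sample versions (dividing by $n - 1$) and the population versions (dividing by $N$) so the difference is visible. The IQR is left out here because it depends on which quartile convention a course uses.

```python
import statistics

# Hypothetical data set of eight values with mean 5
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)                # highest minus lowest = 7

sample_variance = statistics.variance(data)       # sum of squared deviations / (n - 1) = 32/7
sample_std_dev = statistics.stdev(data)           # square root of the sample variance

population_variance = statistics.pvariance(data)  # sum of squared deviations / N = 4
population_std_dev = statistics.pstdev(data)      # square root of the population variance = 2.0

print(data_range, sample_variance, sample_std_dev)
print(population_variance, population_std_dev)
```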

Boxplots and Five-Number Summary

  • Five-Number Summary: Minimum, $Q_1$, Median, $Q_3$, Maximum

  • Boxplot: A graphical representation of the five-number summary.
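The five-number summary can be sketched as follows. Quartile conventions differ between textbooks and software; this example assumes the common introductory convention in which $Q_1$ and $Q_3$ are the medians of the lower and upper halves of the ordered data, with the overall median excluded when the number of values is odd. The function name and data are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                      # values below the median position
    upper = data[half:] if n % 2 == 0 else data[half + 1:]   # skip the median itself when n is odd
    return (data[0],
            statistics.median(lower),
            statistics.median(data),
            statistics.median(upper),
            data[-1])

# Hypothetical data set
summary = five_number_summary([2, 4, 4, 4, 5, 5, 7, 9])
minimum, q1, median, q3, maximum = summary
iqr = q3 - q1  # width of the box in a boxplot

print(summary, iqr)
```

Other conventions, such as the interpolation used by default in `statistics.quantiles`, can give slightly different quartiles for the same data.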

Z-Scores

  • Z-Score: Measures how many standard deviations a value is from the mean: $z = \frac{x - \bar{x}}{s}$ for a sample, or $z = \frac{x - \mu}{\sigma}$ for a population.
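A z-score is a one-line calculation: subtract the mean and divide by the standard deviation. The function name and the exam figures below are assumptions made up for this sketch.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Hypothetical exam with mean 70 and standard deviation 10
above = z_score(85, 70, 10)   # (85 - 70) / 10 = 1.5
below = z_score(55, 70, 10)   # (55 - 70) / 10 = -1.5

print(above, below)
```

A positive z-score means the value is above the mean, and a negative z-score means it is below.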

Linear Regression and Correlation

Linear Equations and Least-Squares Regression

  • Regression Line Equation: $\hat{y} = b_0 + b_1 x$, where $b_1$ is the slope and $b_0$ is the y-intercept.

  • Least-Squares Criterion: Minimizes the sum of squared residuals.

  • Residual: $e = y - \hat{y}$, the observed value minus the value predicted by the regression line.
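The least-squares line can be computed directly from the data. The sketch below assumes the usual closed-form solution, slope $b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$ and intercept $b_0 = \bar{y} - b_1 \bar{x}$; the symbols, function name, and data are chosen for illustration.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the line that minimizes the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical paired data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = least_squares_line(xs, ys)   # intercept 2.2, slope 0.6 (up to rounding)

# Residual for each point: observed y minus predicted y
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
sum_of_squared_residuals = sum(e ** 2 for e in residuals)

print(b0, b1, residuals)
```

For a least-squares line with an intercept, the residuals sum to zero apart from floating-point rounding, which makes a handy sanity check.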

Correlation

  • Pearson Correlation Coefficient (r): Measures the strength and direction of a linear relationship between two variables.

  • Coefficient of Determination ($r^2$): Proportion of variance in the dependent variable explained by the independent variable.
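Both quantities can be computed by hand. This sketch assumes the standard formula $r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$ and, for a single predictor, takes the coefficient of determination as the square of $r$. The function name and data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical paired data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)   # 6 / sqrt(10 * 6), about 0.775
r_squared = r ** 2      # about 0.6

print(round(r, 3), round(r_squared, 3))
```

Here roughly 60% of the variation in the y-values is explained by the linear relationship with x.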

Outliers and Influential Observations

  • Outlier: A data point that lies far from other observations.

  • Influential Observation: A point whose inclusion or removal noticeably changes the slope or intercept of the regression line.
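One way to see influence is to fit the line with and without a suspect point and compare the slopes. The data, the extra point, and the helper name below are all invented for this sketch.

```python
def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx

# Hypothetical data with an upward trend
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# One extra point far to the right with an unusually low y-value
xs_with_point = xs + [10]
ys_with_point = ys + [2]

slope_without = slope(xs, ys)                       # positive slope
slope_with = slope(xs_with_point, ys_with_point)    # slope turns negative

print(slope_without, slope_with)
```

A single added point reverses the sign of the slope in this example, which is what makes it influential rather than merely unusual.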

Key Terms Table

The following table summarizes some of the most important terms introduced in the study guide:

Term | Definition
--- | ---
Population | The entire group of individuals or items under study.
Sample | A subset of the population selected for analysis.
Parameter | A numerical summary of a population.
Statistic | A numerical summary of a sample.
Variable | A characteristic or attribute that can assume different values.
Mean | The arithmetic average of a set of values.
Standard Deviation | A measure of the spread of data around the mean.
Regression Line | The best-fitting straight line through a set of data points in linear regression.
Correlation Coefficient | A measure of the strength and direction of a linear relationship between two variables.

Summary

This guide provides a concise overview of the essential concepts, methods, and terminology in introductory statistics. Mastery of these topics is fundamental for further study and application in data analysis, research, and interpretation of statistical results.
