MAT 137: Midterm Study Guide (Chapters 1–5)
Statistics, Data, and Statistical Thinking
Introduction to Statistics
Statistics is the science of collecting, organizing, analyzing, and interpreting data to make informed decisions. It is widely used in business, economics, and many other fields to draw conclusions from data.
Population: The entire group of individuals or items of interest.
Sample: A subset of the population, selected for analysis.
Parameter: A numerical summary of a population.
Statistic: A numerical summary of a sample.
Example: If a company wants to know the average salary of its employees (population), it may survey 50 employees (sample) and calculate the average (statistic).
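The distinction between a parameter and a statistic can be sketched in Python. This is a minimal illustration only: the salary figures are randomly generated placeholders, and the population size of 500 is an assumption, not something taken from the course materials.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: salaries of all 500 employees (made-up numbers).
population = [random.randint(40_000, 120_000) for _ in range(500)]

# Parameter: a numerical summary of the entire population.
population_mean = statistics.mean(population)

# Statistic: a numerical summary of a sample of 50 employees.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values will usually be close but not identical, which is why a statistic is treated as an estimate of the parameter.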
Methods for Describing Sets of Data
Collecting Data & Sampling Methods
Proper data collection is essential for valid statistical analysis. Sampling methods determine how samples are selected from the population.
Simple Random Sampling: Every member of the population has an equal chance of being selected.
Stratified Sampling: The population is divided into subgroups (strata), and samples are taken from each stratum.
Cluster Sampling: The population is divided into clusters, some clusters are randomly selected, and all members of chosen clusters are sampled.
Systematic Sampling: Starting from a randomly chosen point, every kth member of the population is selected.
Example: To survey customer satisfaction, a company might use stratified sampling to ensure all age groups are represented.
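The four sampling methods above can be sketched with Python's standard `random` module. The customer list, the age groups, and all function names here are hypothetical and chosen only for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100 customers, each tagged with an age group.
AGE_GROUPS = ("18-34", "35-54", "55+")
population = [{"id": i, "age_group": AGE_GROUPS[i % 3]} for i in range(100)]

def simple_random_sample(pop, n):
    """Every member has the same chance of being selected."""
    return random.sample(pop, n)

def systematic_sample(pop, k):
    """Pick a random start among the first k members, then take every kth member."""
    start = random.randrange(k)
    return pop[start::k]

def stratified_sample(pop, key, n_per_stratum):
    """Split the population into strata by `key`, then sample within each stratum."""
    strata = {}
    for member in pop:
        strata.setdefault(member[key], []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(random.sample(members, n_per_stratum))
    return sample

def cluster_sample(pop, cluster_size, n_clusters):
    """Split into clusters, randomly choose some, and keep all of their members."""
    clusters = [pop[i:i + cluster_size] for i in range(0, len(pop), cluster_size)]
    chosen = random.sample(clusters, n_clusters)
    return [member for cluster in chosen for member in cluster]

print(len(simple_random_sample(population, 10)))            # 10 customers
print(len(systematic_sample(population, 10)))               # every 10th customer
print(len(stratified_sample(population, "age_group", 5)))   # 5 per age group
print(len(cluster_sample(population, 10, 2)))               # 2 whole clusters
```

Note that the stratified version guarantees every age group appears in the sample, which is the point of the customer satisfaction example.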
Visualizing Qualitative vs. Quantitative Data
Qualitative Data: Non-numeric data (e.g., colors, brands).
Quantitative Data: Numeric data (e.g., sales, heights).
Common visualizations include:
Bar Charts: For qualitative data.
Histograms: For quantitative data, showing frequency distributions.
Frequency Distributions & Histograms
Frequency Distribution: A table that displays the frequency of various outcomes in a sample.
Histogram: A graphical representation of a frequency distribution for quantitative data.
Example: A histogram can show the distribution of exam scores in a class.
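A frequency distribution and a rough text histogram can be built as follows. The exam scores and the class width of 10 are made-up values for illustration.

```python
from collections import Counter

# Hypothetical exam scores (made-up data).
scores = [55, 62, 67, 71, 73, 75, 78, 78, 81, 84, 85, 88, 90, 93, 97]

bin_width = 10  # assumed class width

# Map each score to the lower limit of its class, e.g. 78 -> 70.
frequency = Counter((score // bin_width) * bin_width for score in scores)

# Print the frequency table with one asterisk per observation.
for lower in sorted(frequency):
    upper = lower + bin_width - 1
    count = frequency[lower]
    print(f"{lower}-{upper}: {count:2d} {'*' * count}")
```

Each row is one class of the frequency distribution; the bars of asterisks play the role of the histogram's bars.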
Measures of Central Tendency and Variation
Mean (\( \bar{x} \)): The arithmetic average of a data set. Formula: \( \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \)
Median: The middle value when data are ordered.
Standard Deviation (s): Measures the spread of data around the mean. Formula for a sample: \( s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \)
Percentiles & Quartiles: Indicate the relative standing of a value within a data set.
25th percentile = 1st quartile (Q1)
50th percentile = median (Q2)
75th percentile = 3rd quartile (Q3)
Example: A score at the 90th percentile on a test means that about 90% of test-takers scored at or below it.
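These measures can be computed with Python's `statistics` module. The data set below is hypothetical. Quartile conventions differ slightly between textbooks and software, so the Q1 and Q3 values from `statistics.quantiles` (default "exclusive" method) may not match a hand calculation or Excel exactly.

```python
import statistics

# Hypothetical data set (made-up values).
data = [4, 8, 15, 16, 23, 42]

mean = statistics.mean(data)      # sum of values divided by n -> 18
median = statistics.median(data)  # average of the two middle values -> 15.5
s = statistics.stdev(data)        # sample standard deviation (divides by n - 1)

# Cut points that split the ordered data into four equal parts.
q1, q2, q3 = statistics.quantiles(data, n=4)

print(f"mean = {mean}, median = {median}, s = {s:.3f}")
print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}")
```

Here the mean (18) is larger than the median (15.5) because the value 42 pulls the average upward, a quick sign of right skew.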
Probability
Basic Concepts of Probability
Probability quantifies the likelihood of an event occurring, ranging from 0 (impossible) to 1 (certain).
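Complement: The probability that an event does not occur. Formula: \( P(A^c) = 1 - P(A) \)
Addition Rule: For events A and B, the probability that A or B (or both) occurs. Formula: \( P(A \cup B) = P(A) + P(B) - P(A \cap B) \)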
Example: If the probability of rain is 0.3, the probability it does not rain is 0.7.
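Both rules are one-line calculations: the complement is 1 minus the event's probability, and the addition rule subtracts the overlap so it is not counted twice. The sketch below uses hypothetical function names; the card example (one card drawn from a standard 52-card deck) is added for illustration.

```python
def complement(p_a):
    """P(not A) = 1 - P(A)."""
    return 1 - p_a

def union(p_a, p_b, p_a_and_b):
    """P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

# Rain example from the text: P(rain) = 0.3.
p_no_rain = complement(0.3)

# One card from a standard deck: A = heart (13 cards), B = face card (12 cards),
# A and B = face cards that are hearts (3 cards).
p_heart_or_face = union(13 / 52, 12 / 52, 3 / 52)

print(f"P(no rain) = {p_no_rain:.1f}")
print(f"P(heart or face card) = {p_heart_or_face:.4f}")  # 22/52
```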
Contingency Tables
Contingency tables display the joint frequencies of two categorical variables so that the relationship between them can be analyzed.
| | Category 1 | Category 2 | Total |
|---|---|---|---|
| Group A | n11 | n12 | n1. |
| Group B | n21 | n22 | n2. |
| Total | n.1 | n.2 | n |
Example: A table showing the number of customers who prefer different brands by age group.
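A contingency table can be tallied from raw responses as sketched below. The survey data, age groups, and brand names are made up for illustration.

```python
from collections import Counter

# Hypothetical survey responses as (age group, preferred brand) pairs.
responses = (
    [("Under 40", "Brand X")] * 3
    + [("Under 40", "Brand Y")] * 2
    + [("40 and over", "Brand X")] * 1
    + [("40 and over", "Brand Y")] * 4
)

cell_counts = Counter(responses)                        # n11, n12, n21, n22
row_totals = Counter(group for group, _ in responses)   # n1., n2.
col_totals = Counter(brand for _, brand in responses)   # n.1, n.2
grand_total = len(responses)                            # n

groups = ["Under 40", "40 and over"]
brands = ["Brand X", "Brand Y"]

# Print the table with row and column totals.
print(f"{'':12}" + "".join(f"{b:>10}" for b in brands) + f"{'Total':>10}")
for g in groups:
    cells = "".join(f"{cell_counts[(g, b)]:>10}" for b in brands)
    print(f"{g:12}{cells}{row_totals[g]:>10}")
print(f"{'Total':12}" + "".join(f"{col_totals[b]:>10}" for b in brands) + f"{grand_total:>10}")
```

Dividing a cell count by the grand total gives a joint probability estimate, for example `cell_counts[("Under 40", "Brand X")] / grand_total`.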
Random Variables and Probability Distributions
Discrete Random Variables & Binomial Distribution
Discrete Random Variable: Takes on countable values (e.g., number of sales).
Binomial Distribution: Models the number of successes in n independent trials, each with the same probability p of success.
Binomial probability formula: \( P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k} \), for \( k = 0, 1, \ldots, n \)
Example: Flipping a coin 10 times and counting the number of heads.
Finding Binomial Probabilities in Excel: Use BINOM.DIST(k, n, p, FALSE) for the exact probability P(X = k) and BINOM.DIST(k, n, p, TRUE) for the cumulative probability P(X ≤ k).
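The binomial formula, the number of ways to choose k successes times p^k times (1 - p)^(n - k), can be checked by hand or in Python. The sketch below uses the coin-flip example; the function names are hypothetical, and the two functions mirror what BINOM.DIST returns with FALSE and TRUE as the last argument.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_cdf(k, n, p):
    """P(X <= k): probability of at most k successes in n trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

# Ten flips of a fair coin.
p_exactly_5 = binomial_pmf(5, 10, 0.5)   # 252/1024
p_at_most_5 = binomial_cdf(5, 10, 0.5)

print(f"P(X = 5)  = {p_exactly_5:.4f}")
print(f"P(X <= 5) = {p_at_most_5:.4f}")
```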
Normal Distribution
Standard Normal Distribution: A normal distribution with mean 0 and standard deviation 1.
Non-Standard Normal Distribution: Any normal distribution with mean \( \mu \) and standard deviation \( \sigma \).
Standardizing a value (z-score): \( z = \frac{x - \mu}{\sigma} \)
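Finding Probabilities, Z Values, and X Values in Excel: Use NORM.DIST(x, mean, standard_dev, TRUE) to find P(X ≤ x) and NORM.INV(probability, mean, standard_dev) to find the x value for a given cumulative probability. NORM.S.DIST and NORM.S.INV do the same for z values on the standard normal distribution.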
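The same calculations can be sketched with `statistics.NormalDist` from the Python standard library. The mean of 100 and standard deviation of 15 are assumed values for illustration; the comments note which Excel function each line roughly corresponds to.

```python
from statistics import NormalDist

mu, sigma = 100, 15            # hypothetical population mean and standard deviation
dist = NormalDist(mu, sigma)

x = 130
z = (x - mu) / sigma           # z-score: how many standard deviations x is above the mean

p_below = dist.cdf(x)              # P(X <= 130), like NORM.DIST(130, 100, 15, TRUE)
x_90 = dist.inv_cdf(0.90)          # x with 90% of the area below it, like NORM.INV(0.90, 100, 15)
z_90 = NormalDist().inv_cdf(0.90)  # same on the standard normal, like NORM.S.INV(0.90)

print(f"z = {z}")
print(f"P(X <= {x}) = {p_below:.4f}")
print(f"90th percentile: x = {x_90:.2f}, z = {z_90:.4f}")
```

Converting back from a z value to an x value uses x = mu + z * sigma, which is why `x_90` equals `mu + z_90 * sigma`.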
Sampling Distributions
Sampling Distribution of the Sample Mean & Central Limit Theorem
The sampling distribution of the sample mean describes the distribution of sample means from all possible samples of a given size from a population.
Central Limit Theorem (CLT): For sufficiently large sample sizes (a common rule of thumb is n ≥ 30), the sampling distribution of the sample mean is approximately normal, regardless of the shape of the population's distribution.
Mean of Sampling Distribution: \( \mu_{\bar{x}} = \mu \)
Standard Error: \( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \)
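Example: If many samples of 50 people each are drawn, the sample mean heights will be approximately normally distributed, because n = 50 is large enough for the CLT to apply.
Finding Distribution of Sample Mean in Excel: Use NORM.DIST(x, μ, σ/SQRT(n), TRUE), passing the sample mean value of interest as x, the population mean μ as the mean, and the standard error as the standard deviation.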
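A small simulation can make the CLT concrete. The sketch below assumes a strongly right-skewed population (exponential with mean 10 and standard deviation 10, chosen only for illustration), draws 2,000 samples of size 50, and compares the simulated sample means with the theoretical mean μ and standard error σ/√n.

```python
import random
import statistics
from math import sqrt
from statistics import NormalDist

random.seed(42)  # fixed seed so the sketch is reproducible

mu, sigma = 10.0, 10.0   # an exponential population has mean = standard deviation
n = 50                   # sample size
num_samples = 2000       # number of repeated samples

# Draw many samples and record each sample's mean.
sample_means = [
    statistics.mean([random.expovariate(1 / mu) for _ in range(n)])
    for _ in range(num_samples)
]

standard_error = sigma / sqrt(n)

print(f"Mean of sample means: {statistics.mean(sample_means):.3f} (theory: {mu})")
print(f"SD of sample means:   {statistics.stdev(sample_means):.3f} (theory: {standard_error:.3f})")

# Probability that a sample mean is below 12, using the normal approximation,
# like NORM.DIST(12, 10, 10/SQRT(50), TRUE) in Excel.
p_mean_below_12 = NormalDist(mu, standard_error).cdf(12)
print(f"P(sample mean < 12) = {p_mean_below_12:.4f}")
```

Even though individual values from this population are heavily skewed, the simulated sample means cluster symmetrically around 10 with a spread close to σ/√n.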
Sampling Distribution of Sample Proportion
Sample Proportion (\( \hat{p} \)): The proportion of successes in a sample.
Mean: \( \mu_{\hat{p}} = p \)
Standard Error: \( \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} \)
Example: If 60 out of 100 surveyed customers prefer a product, \( \hat{p} = 0.6 \).
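The sample proportion and its standard error, sqrt(p(1 - p)/n), can be computed as sketched below. The survey counts come from the example above; the true proportion p = 0.5 is an assumption added for illustration, and the normal approximation is only reasonable when both np and n(1 - p) are large enough (textbooks differ on the exact cutoff).

```python
from math import sqrt
from statistics import NormalDist

successes, n = 60, 100
p_hat = successes / n              # sample proportion = 0.6

p = 0.5                            # assumed true population proportion (hypothetical)
standard_error = sqrt(p * (1 - p) / n)

# How unusual is a sample proportion of 0.6 or more if p is really 0.5?
z = (p_hat - p) / standard_error
p_at_least = 1 - NormalDist(p, standard_error).cdf(p_hat)

print(f"p_hat = {p_hat}, standard error = {standard_error:.3f}, z = {z:.2f}")
print(f"P(p_hat >= {p_hat}) = {p_at_least:.4f}")
```

Under these assumptions the observed proportion sits about two standard errors above p, so a result this high would be fairly rare.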