Random Variables, Sampling Distributions, Confidence Intervals, and Hypothesis Testing: Key Concepts for Business Statistics
Study Guide - Smart Notes
Random Variables and Probability Distributions
Definition and Types of Random Variables
A random variable is a numeric value associated with the outcome of a probability experiment. Random variables are classified as either discrete or continuous:
Discrete Random Variable: Takes on a countable set of values, often whole numbers. Typically associated with counting (e.g., number of defective items).
Continuous Random Variable: Can take on any value within an interval. Typically associated with measurement (e.g., weight, time, volume).
Common notation for random variables includes X, Y, and Z.
Probability Distributions
Discrete Probability Distribution: Represented by a table or chart listing all possible values and their probabilities.
Continuous Probability Distribution: Represented by a smooth curve called a probability density function (PDF). The total area under the curve equals 1.
Key Properties and Formulas for Discrete Random Variables
Mean (Expected Value): $\mu = E(X) = \sum x \, P(x)$
Variance: $\sigma^2 = \sum (x - \mu)^2 \, P(x)$
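The mean of a discrete random variable is the probability-weighted sum of its values, and the variance is the probability-weighted sum of squared deviations from that mean. The sketch below is one way to compute both from a probability table. The distribution (number of defective items, with made-up probabilities) and the helper names `discrete_mean` and `discrete_variance` are assumptions for illustration, not part of the original notes.

```python
# Hypothetical distribution: X = number of defective items in a batch.
# The probabilities are made up for illustration and sum to 1.
distribution = {0: 0.5, 1: 0.3, 2: 0.2}

def discrete_mean(dist):
    """Expected value: sum of x * P(x) over all values x."""
    return sum(x * p for x, p in dist.items())

def discrete_variance(dist):
    """Variance: sum of (x - mu)^2 * P(x) over all values x."""
    mu = discrete_mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

mean = discrete_mean(distribution)
variance = discrete_variance(distribution)
print(f"mean = {mean:.2f}, variance = {variance:.2f}")
```

The standard deviation is the square root of the variance, so `variance ** 0.5` would give it here.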
Key Discrete Random Variables
Binomial Random Variable:
Fixed number of independent trials
Each trial has two outcomes: success or failure
Constant probability of success, p
Counts the number of successes in n trials
Calculator commands: BINOMPDF (exactly X successes), BINOMCDF (X or fewer successes)
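For readers without a calculator at hand, the two binomial probabilities can be sketched in Python. The helpers `binom_pmf` and `binom_cdf` are hypothetical names that play roughly the role of BINOMPDF and BINOMCDF, and the values n = 10, p = 0.1 are assumptions chosen only for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k): probability of k or fewer successes in n trials."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

# Hypothetical example: 10 items, each defective with probability 0.1.
n, p = 10, 0.1
print(f"P(X = 2)  = {binom_pmf(2, n, p):.4f}")
print(f"P(X <= 2) = {binom_cdf(2, n, p):.4f}")
```

A probability such as "more than 2 successes" follows from the complement: `1 - binom_cdf(2, n, p)`.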
Poisson Random Variable:
Counts the number of events in a fixed interval of time or space
Events occur independently
Constant average rate, λ (lambda)
Calculator commands: POISSONPDF (exactly X events), POISSONCDF (X or fewer events)
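The Poisson probabilities can be sketched the same way. The helpers `poisson_pmf` and `poisson_cdf` are hypothetical names standing in for POISSONPDF and POISSONCDF, and the rate of 3 events per interval is an assumed value for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k): probability of exactly k events when the average rate is lam."""
    return exp(-lam) * lam**k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k): probability of k or fewer events when the average rate is lam."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

# Hypothetical example: on average 3 customer calls arrive per hour.
lam = 3.0
print(f"P(X = 0)  = {poisson_pmf(0, lam):.4f}")
print(f"P(X <= 1) = {poisson_cdf(1, lam):.4f}")
```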
Hypergeometric Random Variable:
Sampling without replacement from a finite population with known composition
Useful for probability of a certain composition in the sample (e.g., selecting colored marbles from a jar)
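A minimal sketch of the hypergeometric probability, using the standard counting formula: ways to pick the successes, times ways to pick the failures, divided by ways to pick the whole sample. The helper name `hypergeom_pmf` and the jar composition (20 marbles, 8 red, 5 drawn) are assumptions for illustration.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in a sample of n drawn without replacement
    from a population of N items that contains K successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Hypothetical jar: 20 marbles, 8 of them red; draw 5 without replacement.
N, K, n = 20, 8, 5
for k in range(n + 1):
    print(f"P({k} red) = {hypergeom_pmf(k, N, K, n):.4f}")
```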
Continuous Random Variables
Result from measurement; can take any value within an interval
Probability distribution described by a probability density function (PDF)
Probability is the area under the PDF curve over an interval
Normal Random Variable
Most important continuous random variable
PDF is bell-shaped, centered at the mean (μ), with inflection points at one standard deviation (σ) from the mean
Calculator commands: NORMALCDF (probabilities), INVNORM (find value for a given percentile)
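Python's standard library offers `statistics.NormalDist`, whose `cdf` and `inv_cdf` methods are roughly analogous to NORMALCDF and INVNORM. The sketch below assumes a normal variable with mean 100 and standard deviation 15; those numbers are illustrative only.

```python
from statistics import NormalDist

# Hypothetical normal variable: mean 100, standard deviation 15.
X = NormalDist(mu=100, sigma=15)

# Area between two values (what NORMALCDF is used for):
# P(85 < X < 115) is the difference of two cumulative probabilities.
p_between = X.cdf(115) - X.cdf(85)

# Value at a given percentile (what INVNORM is used for):
# the 90th percentile is the x with 90% of the area to its left.
p90 = X.inv_cdf(0.90)

print(f"P(85 < X < 115) = {p_between:.4f}")
print(f"90th percentile = {p90:.2f}")
```

Because 85 and 115 sit one standard deviation either side of the mean, the first result is the familiar "about 68%" figure.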
Standard Normal Random Variable (Z)
Mean = 0, Standard deviation = 1
Z-scores represent the number of standard deviations from the mean
Useful for comparing different normal distributions (e.g., SAT vs. ACT scores)
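As a sketch of how z-scores put different scales on common footing, the snippet below compares one SAT score with one ACT score. All means, standard deviations, and scores are made-up values for illustration, not real test statistics.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical score distributions and scores (illustrative values only).
sat_z = z_score(1250, mu=1050, sigma=200)
act_z = z_score(27, mu=21, sigma=5)

print(f"SAT z-score = {sat_z:.2f}")
print(f"ACT z-score = {act_z:.2f}")

# The score with the larger z-score is further above its own mean.
better = "ACT" if act_z > sat_z else "SAT"
print(f"Relatively stronger result: {better}")
```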
Sampling Distributions
Definition and Importance
A sampling distribution is the probability distribution of a statistic (such as the sample mean or sample proportion) computed from a random sample. Sampling distributions help us understand the variability of sample statistics and make inferences about population parameters.
Key Properties
Sample averages ($\bar{x}$) and sample proportions ($\hat{p}$) are random variables: their values change from sample to sample, and for large samples they are treated as continuous.
Their probability distributions are called sampling distributions.
Formulas for Sampling Distributions
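Sample Mean:
Expected value: $E(\bar{x}) = \mu$
Standard deviation: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$ (approximated by $s / \sqrt{n}$ for large samples)
Sample Proportion:
Expected value: $E(\hat{p}) = p$
Standard deviation: $\sigma_{\hat{p}} = \sqrt{p(1 - p) / n}$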
Central Limit Theorem (CLT)
As a rule of thumb, if the sample size is n ≥ 30, the sampling distribution of the sample mean is approximately normal, regardless of the shape of the population's distribution.
If the population is normal, the sampling distribution of the sample mean is normal for any sample size.
Larger samples yield a tighter (less variable) sampling distribution.
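A small simulation can make the Central Limit Theorem concrete. The sketch below assumes a right-skewed exponential population (mean 1, standard deviation 1), draws many samples of size 36, and compares the spread of the sample means with the theoretical value σ/√n. The population, sample size, number of samples, and seed are all illustrative choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(42)  # fixed seed so the run is repeatable

n = 36              # size of each sample
num_samples = 5000  # number of repeated samples
sigma = 1.0         # population standard deviation of the exponential(1) population

# Draw repeated samples from a skewed population and record each sample mean.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)
theoretical_sd = sigma / sqrt(n)

print(f"mean of sample means    = {mean_of_means:.3f} (population mean is 1)")
print(f"std dev of sample means = {sd_of_means:.3f}")
print(f"sigma / sqrt(n)         = {theoretical_sd:.3f}")
```

Rerunning with a larger `n` should shrink the spread of the sample means, which is the "tighter sampling distribution" described above.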
Estimator Properties
Unbiased Estimator: An estimator whose expected value equals the population parameter (e.g., sample mean for population mean).
Minimum Variance: Among all unbiased estimators, the one with the smallest variance is preferred.
Confidence Intervals for Population Mean (μ) and Proportion (p)
Definition and Interpretation
A confidence interval (CI) is a range of values, derived from sample statistics, that is likely to contain the population parameter with a specified level of confidence (e.g., 95%).
Confidence Level (CL): The proportion of intervals that would contain the parameter if the sampling were repeated many times (e.g., 95%).
Margin of Error: The half-width of the CI; reflects sampling variability.
Constructing Confidence Intervals
Large Sample CI for μ:
Use the normal distribution (Central Limit Theorem applies)
Substitute sample standard deviation s for population σ if unknown
Calculator command: Zinterval
Critical value: Find z for area α/2 in each tail using INVNORM
Small Sample CI for μ:
Use the t-distribution if the population is approximately normal
Calculator command: Tinterval
Degrees of freedom (DF): n – 1
Critical value: Find t for area α/2 in each tail using INVT
Large Sample CI for p:
Requires at least 15 successes and 15 failures in the sample
Use normal approximation
Calculator command: 1prop-ZINT
Small Sample CI for p:
Used when there are fewer than 15 successes or failures
No standard calculator command; see detailed notes for methods
Determining Sample Size: To achieve a desired margin of error, use p = 0.5 if no prior estimate is available.
Alpha (α): α = 1 – confidence level; determines critical z or t values (area α/2 in each tail).
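The large-sample interval for p and the sample-size calculation can be sketched as follows. The sample-size step uses the standard formula n = z² p(1 – p) / E², where E is the desired margin of error; that formula and the helper names `z_critical`, `proportion_ci`, and `sample_size_for_proportion` are not from the notes and are introduced here for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_critical(confidence):
    """z value leaving area alpha/2 in each tail, where alpha = 1 - confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample CI for p; checks the 15 successes / 15 failures condition."""
    if successes < 15 or n - successes < 15:
        raise ValueError("need at least 15 successes and 15 failures")
    p_hat = successes / n
    margin = z_critical(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def sample_size_for_proportion(margin, confidence=0.95, p=0.5):
    """Smallest n for the desired margin of error.
    p = 0.5 is the conservative choice when no prior estimate exists."""
    z = z_critical(confidence)
    return ceil(z**2 * p * (1 - p) / margin**2)

low, high = proportion_ci(60, 100)  # hypothetical sample: 60 successes in 100
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
print(f"n for a 3-point margin: {sample_size_for_proportion(0.03)}")
```

Using p = 0.5 maximizes p(1 – p), so the resulting n is large enough whatever the true proportion turns out to be.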
Hypothesis Testing for Population Mean (μ) and Proportion (p)
Key Concepts and Steps
Null Hypothesis (H₀): Represents the status quo; always contains an equality (e.g., H₀: μ = μ₀).
Alternative Hypothesis (Hₐ): Represents the claim to be tested; uses >, <, or ≠.
Type I Error (α): Rejecting H₀ when it is true.
Type II Error (β): Failing to reject H₀ when it is false.
Significance Level (α): Probability of a Type I error; sets the threshold for rejection.
Test Statistic: Converts the sample statistic to a z or t value for comparison.
Critical Value: The z or t value marking the start of the rejection region.
Rejection Region: Area(s) in the tails where H₀ is rejected.
P-value: Probability of observing a result at least as extreme as the sample result, assuming H₀ is true.
Decision Rule:
If p-value < α, reject H₀.
If p-value ≥ α, fail to reject H₀ (H₀ is plausible).
Alternatively, compare test statistic to critical value(s).
Reporting Results:
"There is sufficient evidence at α = xx to reject H₀ and accept Hₐ."
"There is insufficient evidence at α = xx to reject H₀. H₀ is plausible."
Types of Tests
One-Tailed Test (Left): Hₐ: parameter < value
One-Tailed Test (Right): Hₐ: parameter > value
Two-Tailed Test: Hₐ: parameter ≠ value
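The link between the direction of Hₐ, the p-value, and the decision rule can be sketched for a z test statistic. The helper names `z_test_p_value` and `decide` are hypothetical, and the test statistic z = 2.0 is an assumed value for illustration.

```python
from statistics import NormalDist

def z_test_p_value(z, tail):
    """p-value for a z test statistic; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()  # standard normal: mean 0, std dev 1
    if tail == "left":    # Ha: parameter < value
        return std_normal.cdf(z)
    if tail == "right":   # Ha: parameter > value
        return 1 - std_normal.cdf(z)
    if tail == "two":     # Ha: parameter != value
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

def decide(p_value, alpha=0.05):
    """Decision rule: reject H0 only when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

z = 2.0  # hypothetical test statistic
for tail in ("left", "right", "two"):
    p = z_test_p_value(z, tail)
    print(f"{tail:>5}-tailed: p-value = {p:.4f} -> {decide(p)}")
```

The same test statistic can lead to different decisions depending on the alternative, which is why Hₐ should be fixed before looking at the data.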
Test Selection and Calculator Commands
Large Sample Test for μ: Use normal distribution; ZTEST
Small Sample Test for μ: Use t-distribution if population is normal; TTEST
Large Sample Test for p: Use normal distribution if n·p₀ and n·(1–p₀) are both ≥ 15; 1propZtest
Summary Table: Key Random Variables and Their Properties
| Random Variable | Type | Key Properties | Calculator Command |
|---|---|---|---|
| Binomial | Discrete | Fixed n, independent trials, constant p, two outcomes | BINOMPDF, BINOMCDF |
| Poisson | Discrete | Counts events in interval, constant rate λ, independence | POISSONPDF, POISSONCDF |
| Hypergeometric | Discrete | Sampling without replacement, known population composition | -- |
| Normal | Continuous | Bell-shaped, mean μ, std dev σ | NORMALCDF, INVNORM |
| Standard Normal (Z) | Continuous | Mean 0, std dev 1 | NORMALCDF, INVNORM |
Example: Constructing a 95% Confidence Interval for the Mean
Suppose a sample of n = 36 has a mean of 50 and a standard deviation of 12.
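95% CI for μ: $\bar{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
For 95% confidence, $\alpha = 0.05$ and $z_{\alpha/2} = z_{0.025} = 1.96$
Margin of error: $1.96 \times \frac{12}{\sqrt{36}} = 1.96 \times 2 = 3.92$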
CI: (50 – 3.92, 50 + 3.92) = (46.08, 53.92)
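The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the large-sample z interval using the numbers given above; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: n = 36, sample mean 50, sample std dev 12.
n, x_bar, s = 36, 50, 12
confidence = 0.95

alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value with alpha/2 in each tail
margin = z * s / sqrt(n)                 # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(f"z = {z:.2f}, margin = {margin:.2f}, CI = ({lower:.2f}, {upper:.2f})")
```

Rounded to two decimals, this reproduces the interval (46.08, 53.92).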
Example: Hypothesis Test for a Proportion
Suppose in a sample of 100, 60 are successes. Test H₀: p = 0.5 vs. Hₐ: p ≠ 0.5 at α = 0.05.
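Test statistic: $z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$
$\hat{p} = 0.60$, $p_0 = 0.50$, $n = 100$, so $z = \frac{0.60 - 0.50}{\sqrt{0.50 \times 0.50 / 100}} = \frac{0.10}{0.05} = 2.00$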
p-value ≈ 0.0455 (two-tailed)
Since p-value < 0.05, reject H₀.
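The same test can be sketched in Python using the numbers from this example. The test statistic is the difference between the sample proportion and p₀, divided by the standard deviation of the sample proportion under H₀; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: 60 successes in n = 100, H0: p = 0.5, alpha = 0.05.
n, successes, p0, alpha = 100, 60, 0.5, 0.05

# The large-sample condition uses the hypothesized p0.
assert n * p0 >= 15 and n * (1 - p0) >= 15

p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)    # test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed p-value

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

Note that the standard deviation in the denominator uses p₀ rather than the sample proportion, because the calculation assumes H₀ is true.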
Additional info: Some calculator commands and detailed procedures are referenced for classroom calculators (e.g., TI-84), but the underlying statistical principles apply universally.