Chapter 5
Study Guide - Smart Notes
Discrete Probability Distributions
Random Variables and Probability Distributions
A random variable is a variable that takes a single numerical value determined by chance for each outcome of a procedure. Probability distributions describe the likelihood of each possible value of a random variable, often using tables, formulas, or graphs.
Discrete random variable: Has a finite or countable set of values (e.g., number of coin tosses until heads).
Continuous random variable: Has infinitely many values on a continuous scale, so the values cannot be counted individually (e.g., body temperature).
This chapter focuses exclusively on discrete random variables.
Requirements for a Probability Distribution
Every probability distribution must satisfy the following:
There is a numerical random variable x with associated probabilities.
$\sum P(x) = 1$ (the sum of all probabilities must be 1; minor rounding errors are acceptable).
$0 \le P(x) \le 1$ for every value of x (each probability must be between 0 and 1 inclusive).
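The three requirements can be checked mechanically. The following is a minimal Python sketch, assuming the distribution is stored as a dict mapping each value of x to P(x); the function name and the rounding tolerance `tol` are illustrative choices, not part of the chapter.

```python
import math

def is_probability_distribution(table, tol=0.01):
    """Check the three requirements for a dict {x: P(x)}.

    tol is a hypothetical allowance for rounding error in the sum.
    """
    # Requirement 1: every value of the random variable is numerical.
    numeric = all(
        isinstance(x, (int, float)) and not isinstance(x, bool) for x in table
    )
    # Requirement 2: the probabilities sum to 1 (within rounding error).
    sums_to_one = math.isclose(sum(table.values()), 1.0, abs_tol=tol)
    # Requirement 3: each probability lies between 0 and 1 inclusive.
    in_range = all(0 <= p <= 1 for p in table.values())
    return numeric and sums_to_one and in_range

# Number of females in two births, with equally likely male and female births.
two_births = {0: 0.25, 1: 0.50, 2: 0.25}
print(is_probability_distribution(two_births))  # True
```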
Examples of Probability Distributions
Consider the example of the number of females in two births, assuming male and female births are equally likely.
| x: Number of females in two births | P(x) |
|---|---|
| 0 | 0.25 |
| 1 | 0.50 |
| 2 | 0.25 |
This table satisfies all requirements for a probability distribution.
Contrast with the following table, which lists the proportion of unlicensed software in various countries:
| Country | Proportion of Unlicensed Software |
|---|---|
| United States | 0.17 |
| China | 0.70 |
| India | 0.58 |
| Russia | 0.64 |
| Total | 2.09 |
This table does not describe a probability distribution: the proportions sum to 2.09 rather than 1, and the variable (country) is categorical rather than a numerical random variable.
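The two failures can be confirmed directly from the table. This short Python sketch simply restates the table as a dict (the variable names are illustrative) and tests the sum and the type of the variable.

```python
# Proportions of unlicensed software, copied from the table above.
proportions = {
    "United States": 0.17,
    "China": 0.70,
    "India": 0.58,
    "Russia": 0.64,
}

# The values do not sum to 1, so they cannot be the probabilities of one variable.
total = round(sum(proportions.values()), 2)

# The "values of x" are country names, which are categorical, not numerical.
keys_numeric = all(isinstance(k, (int, float)) for k in proportions)

print(total)         # 2.09
print(keys_numeric)  # False
```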
Probability Histograms
A probability histogram visually represents a probability distribution. The vertical axis shows probabilities, and each bar is centered around the value of the random variable. All bars have equal width.
Special Notation: 0+
When a probability value is positive but very small (e.g., 0.000000123), it is sometimes rounded and represented as 0+ to avoid misleadingly suggesting the event is impossible.
Parameters of Probability Distributions
For probability distributions, the mean, variance, and standard deviation are parameters (not statistics), as they describe the entire population.
Mean (μ): The theoretical average outcome for infinitely many trials.
Variance (σ²): Measures the spread of the distribution.
Standard deviation (σ): The square root of the variance.
Formulas for Discrete Probability Distributions
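Mean: $\mu = \sum [x \cdot P(x)]$
Variance (conceptual): $\sigma^2 = \sum [(x - \mu)^2 \cdot P(x)]$
Variance (computational): $\sigma^2 = \sum [x^2 \cdot P(x)] - \mu^2$
Standard deviation: $\sigma = \sqrt{\sum [x^2 \cdot P(x)] - \mu^2}$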
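These parameters are straightforward to compute in code. The sketch below assumes the distribution is stored as a dict {x: P(x)}; the function names are illustrative. It applies the standard formulas to the two-births distribution and shows that the conceptual and computational forms of the variance agree.

```python
import math

def mean(dist):
    # mu = sum of x * P(x)
    return sum(x * p for x, p in dist.items())

def variance_conceptual(dist):
    # sigma^2 = sum of (x - mu)^2 * P(x)
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def variance_computational(dist):
    # sigma^2 = sum of x^2 * P(x), minus mu^2
    mu = mean(dist)
    return sum(x ** 2 * p for x, p in dist.items()) - mu ** 2

def std_dev(dist):
    # sigma = square root of the variance
    return math.sqrt(variance_computational(dist))

two_births = {0: 0.25, 1: 0.50, 2: 0.25}
mu = mean(two_births)
var = variance_conceptual(two_births)
sigma = std_dev(two_births)

# Round only at the end, to one more decimal place than x.
print(round(mu, 1), round(var, 1), round(sigma, 1))  # 1.0 0.5 0.7
```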
Rounding Rules
Carry one more decimal place than the random variable x when rounding results.
If x is an integer, round to one decimal place.
Only round at the end of computations, not during intermediate steps.
Expected Value
The expected value of a discrete random variable x is denoted by E and is equal to the mean: $E = \mu = \sum [x \cdot P(x)]$.
It represents the average value expected over many trials.
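The "average over many trials" interpretation can be illustrated with a small simulation. This is a sketch only: it assumes the two-births distribution from this chapter, and the seed and number of trials are arbitrary choices.

```python
import random

random.seed(0)  # arbitrary fixed seed so the run is repeatable

values = [0, 1, 2]           # number of females in two births
probs = [0.25, 0.50, 0.25]   # corresponding probabilities

# Theoretical expected value: E = sum of x * P(x)
expected = sum(x * p for x, p in zip(values, probs))

# Simulate many trials and average the observed outcomes.
n_trials = 100_000
draws = random.choices(values, weights=probs, k=n_trials)
sample_mean = sum(draws) / n_trials

print(expected)               # theoretical value, 1.0
print(round(sample_mean, 2))  # simulated average, close to the expected value
```

With more trials, the simulated average tends to settle closer to E.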
Worked Example: Number of Females in Two Births
| x | P(x) | x·P(x) | (x – μ)²·P(x) |
|---|---|---|---|
| 0 | 0.25 | 0 | (0 – 1)²·0.25 = 0.25 |
| 1 | 0.50 | 0.5 | (1 – 1)²·0.50 = 0 |
| 2 | 0.25 | 0.5 | (2 – 1)²·0.25 = 0.25 |
| Total | 1.00 | 1.0 | 0.50 |
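Mean: $\mu = \sum [x \cdot P(x)] = 1.0$
Variance: $\sigma^2 = \sum [(x - \mu)^2 \cdot P(x)] = 0.5$
Standard deviation: $\sigma = \sqrt{0.5} \approx 0.7$ (rounded to one decimal place)
The variance and standard deviation are computed with the formulas above.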
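As a cross-check, the computational form of the variance can be applied to the same table. The derivation below is a sketch that uses only the values of x and P(x) given above, with the mean of 1.0 taken from the total of the x·P(x) column.

```latex
\begin{align*}
\sum [x^2 \cdot P(x)] &= 0^2(0.25) + 1^2(0.50) + 2^2(0.25) = 0 + 0.50 + 1.00 = 1.5 \\
\sigma^2 &= \sum [x^2 \cdot P(x)] - \mu^2 = 1.5 - (1.0)^2 = 0.5 \\
\sigma &= \sqrt{0.5} \approx 0.7
\end{align*}
```

Both forms of the variance give 0.5, as they must.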
Range Rule of Thumb for Identifying Significant Values
The range rule of thumb helps identify significant values in a probability distribution:
Significantly low values: $\mu - 2\sigma$ or lower
Significantly high values: $\mu + 2\sigma$ or higher
Not significant: Values between $\mu - 2\sigma$ and $\mu + 2\sigma$
Note: The use of "2" is a guideline, not a strict rule.
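The range rule of thumb translates into a simple classification. The sketch below assumes the usual limits of μ − 2σ and μ + 2σ and uses the two-births distribution (μ = 1.0, σ² = 0.5); the function name and the parameter `k` are illustrative.

```python
import math

def classify(x, mu, sigma, k=2):
    """Classify x with the range rule of thumb (k = 2 is the usual guideline)."""
    low = mu - k * sigma    # at or below this: significantly low
    high = mu + k * sigma   # at or above this: significantly high
    if x <= low:
        return "significantly low"
    if x >= high:
        return "significantly high"
    return "not significant"

# Two-births distribution: mu = 1.0 and sigma = sqrt(0.5), about 0.7.
mu, sigma = 1.0, math.sqrt(0.5)

# Limits are about -0.4 and 2.4, so none of 0, 1, 2 is significant.
for x in (0, 1, 2):
    print(x, classify(x, mu, sigma))
```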
Identifying Significant Results with Probabilities
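Significantly high number of successes: If $P(x \text{ or more}) \le 0.05$, then x successes is a significantly high number.
Significantly low number of successes: If $P(x \text{ or fewer}) \le 0.05$, then x successes is a significantly low number.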
These criteria help determine whether observed outcomes are unusual or noteworthy in the context of the probability distribution.
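The probability criteria can be sketched in Python as tail sums over the distribution. This example assumes the commonly used 0.05 cutoff (a guideline, passed here as the illustrative parameter `cutoff`) and a distribution stored as a dict {x: P(x)}; the function names are hypothetical.

```python
def prob_at_least(dist, x):
    # P(x or more): sum the probabilities of all values >= x
    return sum(p for v, p in dist.items() if v >= x)

def prob_at_most(dist, x):
    # P(x or fewer): sum the probabilities of all values <= x
    return sum(p for v, p in dist.items() if v <= x)

def is_significantly_high(dist, x, cutoff=0.05):
    return prob_at_least(dist, x) <= cutoff

def is_significantly_low(dist, x, cutoff=0.05):
    return prob_at_most(dist, x) <= cutoff

two_births = {0: 0.25, 1: 0.50, 2: 0.25}

# P(2 or more females) = 0.25 > 0.05, so 2 females is not significantly high.
print(prob_at_least(two_births, 2), is_significantly_high(two_births, 2))

# P(0 or fewer females) = 0.25 > 0.05, so 0 females is not significantly low.
print(prob_at_most(two_births, 0), is_significantly_low(two_births, 0))
```

Note that the tail probability (x or more, x or fewer) is used, not the probability of exactly x.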