Chapter 5

Study Guide - Smart Notes

Discrete Probability Distributions

Random Variables and Probability Distributions

A random variable is a variable that takes a single numerical value determined by chance for each outcome of a procedure. Probability distributions describe the likelihood of each possible value of a random variable, often using tables, formulas, or graphs.

Discrete random variable: Has a finite or countable set of values (e.g., number of coin tosses until heads).
Continuous random variable: Takes infinitely many values on a continuous scale, so the values cannot be counted or listed individually (e.g., body temperature).
This chapter focuses exclusively on discrete random variables.

Requirements for a Probability Distribution

Every probability distribution must satisfy the following:

There is a numerical (not categorical) random variable x, and each of its values has an associated probability.
$\sum P(x) = 1$ (the sum of all probabilities must be 1; minor rounding errors are acceptable).
$0 \le P(x) \le 1$ for every value of x (each probability must be between 0 and 1 inclusive).
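
The requirements above can be checked mechanically. The following is a minimal sketch, not a standard routine: the function name `is_probability_distribution` and the rounding tolerance `tol=0.001` are assumptions chosen for illustration.

```python
import math

def is_probability_distribution(probs, tol=0.001):
    """Return True if `probs` meets the three requirements.

    probs: dict mapping each value x to its probability P(x).
    tol:   allowance for rounding error in the sum (an assumed value).
    """
    # Requirement 1: every x is a number, not a category such as a name.
    values_numeric = all(isinstance(x, (int, float)) for x in probs)
    # Requirement 2: the probabilities sum to 1, allowing minor rounding error.
    sums_to_one = math.isclose(sum(probs.values()), 1.0, abs_tol=tol)
    # Requirement 3: each probability lies between 0 and 1 inclusive.
    each_in_range = all(0 <= p <= 1 for p in probs.values())
    return values_numeric and sums_to_one and each_in_range

# Number of females in two births, with equally likely outcomes.
two_births = {0: 0.25, 1: 0.50, 2: 0.25}
print(is_probability_distribution(two_births))
```

A table whose probabilities sum to something other than 1, or whose variable is a label rather than a number, fails the check.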

Examples of Probability Distributions

Consider the example of the number of females in two births, assuming male and female births are equally likely.

x: Number of females in two births	P(x)
0	0.25
1	0.50
2	0.25

This table satisfies all requirements for a probability distribution.

Contrast with the following table, which lists the proportion of unlicensed software in various countries:

Country	Proportion of Unlicensed Software
United States	0.17
China	0.70
India	0.58
Russia	0.64
Total	2.09

This table does not describe a probability distribution: the variable (country) is categorical rather than numerical, and the listed proportions sum to 2.09, not 1.

Probability Histograms

A probability histogram visually represents a probability distribution. The vertical axis shows probabilities, and each bar is centered around the value of the random variable. All bars have equal width.

Special Notation: 0+

When a probability value is positive but very small (e.g., 0.000000123), it is sometimes rounded and represented as 0+ to avoid misleadingly suggesting the event is impossible.
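
As an illustration of the 0+ convention, the sketch below formats a probability for display. The helper name `format_probability` and the default of three decimal places are assumptions, not part of the notation itself.

```python
def format_probability(p, places=3):
    """Format p to `places` decimals, writing '0+' for tiny positive values."""
    # A positive probability that would round to zero is shown as 0+,
    # so the table does not suggest the event is impossible.
    if p > 0 and round(p, places) == 0:
        return "0+"
    return f"{p:.{places}f}"

print(format_probability(0.000000123))
print(format_probability(0.25))
```

A probability of exactly 0 is still printed as zero, since in that case the event really is impossible.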

Parameters of Probability Distributions

For probability distributions, the mean, variance, and standard deviation are parameters (not statistics), as they describe the entire population.

Mean (μ): The theoretical average outcome for infinitely many trials.
Variance (σ²): Measures the spread of the distribution.
Standard deviation (σ): The square root of the variance.

Formulas for Discrete Probability Distributions

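Mean: $\mu = \sum [x \cdot P(x)]$
Variance (conceptual): $\sigma^2 = \sum [(x - \mu)^2 \cdot P(x)]$
Variance (computational): $\sigma^2 = \sum [x^2 \cdot P(x)] - \mu^2$
Standard deviation: $\sigma = \sqrt{\sum [x^2 \cdot P(x)] - \mu^2}$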
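
A short sketch of these calculations in Python, using the two-births distribution from earlier in the chapter. The function name `mean_var_sd` is hypothetical. Each formula is restated in a comment, and both variance forms are computed to show that they agree.

```python
import math

def mean_var_sd(dist):
    """Return (mean, conceptual variance, computational variance, sd).

    dist: dict mapping each value x to its probability P(x).
    """
    # Mean: sum of x * P(x)
    mu = sum(x * p for x, p in dist.items())
    # Variance (conceptual): sum of (x - mu)^2 * P(x)
    var_conceptual = sum((x - mu) ** 2 * p for x, p in dist.items())
    # Variance (computational): sum of x^2 * P(x), minus mu^2
    var_computational = sum(x ** 2 * p for x, p in dist.items()) - mu ** 2
    # Standard deviation: square root of the variance
    sigma = math.sqrt(var_conceptual)
    return mu, var_conceptual, var_computational, sigma

two_births = {0: 0.25, 1: 0.50, 2: 0.25}
mu, var_a, var_b, sigma = mean_var_sd(two_births)
print(mu, var_a, var_b, round(sigma, 1))
```

The two variance formulas are algebraically equivalent. The computational form avoids subtracting the mean from every value, which is convenient for hand calculation.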

Rounding Rules

Round the mean, variance, and standard deviation to one more decimal place than is used for the values of the random variable x.
If x is an integer, round to one decimal place.
Only round at the end of computations, not during intermediate steps.

Expected Value

The expected value of a discrete random variable x is denoted by E and is equal to the mean:

$E = \mu = \sum [x \cdot P(x)]$

It represents the average value expected over many trials.
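
The "many trials" interpretation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `simulate_mean_females`, the fixed seed, and the trial count are arbitrary choices, and each birth is assumed to be female with probability 0.5.

```python
import random

def simulate_mean_females(trials, seed=0):
    """Average number of females in two births over many simulated trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(trials):
        # Each of the two births is female with probability 0.5;
        # True counts as 1 and False as 0 when added.
        total += (rng.random() < 0.5) + (rng.random() < 0.5)
    return total / trials

print(simulate_mean_females(100_000))
```

With a large number of trials, the simulated average should land close to the expected value of 1.0 for this distribution.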

Worked Example: Number of Females in Two Births

x	P(x)	$x \cdot P(x)$	$(x - \mu)^2 \cdot P(x)$
0	0.25	0	0.25
1	0.50	0.5	0
2	0.25	0.5	0.25
Total	1.00	1.0	0.5

Mean: $\mu = \sum [x \cdot P(x)] = 1.0$
Variance: $\sigma^2 = \sum [(x - \mu)^2 \cdot P(x)] = 0.5$
Standard deviation: $\sigma = \sqrt{0.5} \approx 0.7$ (rounded to one decimal place)

The last column can only be filled in after μ is found, because each entry depends on the mean.

Range Rule of Thumb for Identifying Significant Values

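The range rule of thumb identifies significant values using the principle that most values lie within 2 standard deviations of the mean:

Significantly low values: $\mu - 2\sigma$ or lower
Significantly high values: $\mu + 2\sigma$ or higher
Not significant: Values between $\mu - 2\sigma$ and $\mu + 2\sigma$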

Note: The use of "2" is a guideline, not a strict rule.
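
The rule can be sketched as a small classifier. The function name `classify_by_range_rule` is hypothetical, and the limits μ − 2σ and μ + 2σ are the usual range rule of thumb boundaries.

```python
def classify_by_range_rule(x, mu, sigma):
    """Classify x using the limits mu - 2*sigma and mu + 2*sigma."""
    low_limit = mu - 2 * sigma
    high_limit = mu + 2 * sigma
    if x <= low_limit:
        return "significantly low"
    if x >= high_limit:
        return "significantly high"
    return "not significant"

# Two births: mu = 1.0 and sigma = sqrt(0.5), so the limits are
# roughly -0.41 and 2.41, and every possible value (0, 1, 2) falls between them.
sigma = 0.5 ** 0.5
for x in (0, 1, 2):
    print(x, classify_by_range_rule(x, 1.0, sigma))
```

For the two-births distribution no outcome is significant, which matches intuition: even two females in two births is not an unusual result.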

Identifying Significant Results with Probabilities

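Significantly high number of successes: x successes among n trials is significantly high if $P(x \text{ or more}) \le 0.05$.
Significantly low number of successes: x successes among n trials is significantly low if $P(x \text{ or fewer}) \le 0.05$.

Note: The value 0.05 is a commonly used guideline, not a rigid cutoff.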

These criteria help determine whether observed outcomes are unusual or noteworthy in the context of the probability distribution.
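
A minimal sketch of the probability-based check follows. It assumes the commonly used cutoff of 0.05 and compares the probability of "x or more" and of "x or fewer" against it. The function names and the second example distribution are made up for illustration.

```python
def tail_probabilities(dist, x):
    """Return (P(x or more), P(x or fewer)) for a discrete distribution."""
    p_or_more = sum(p for v, p in dist.items() if v >= x)
    p_or_fewer = sum(p for v, p in dist.items() if v <= x)
    return p_or_more, p_or_fewer

def significance_by_probability(dist, x, cutoff=0.05):
    """Classify x by its tail probabilities (cutoff of 0.05 is an assumed guideline)."""
    p_or_more, p_or_fewer = tail_probabilities(dist, x)
    if p_or_more <= cutoff:
        return "significantly high"
    if p_or_fewer <= cutoff:
        return "significantly low"
    return "not significant"

two_births = {0: 0.25, 1: 0.50, 2: 0.25}
print(significance_by_probability(two_births, 2))  # P(2 or more) = 0.25

# Hypothetical distribution with small tails, for contrast.
peaked = {0: 0.01, 1: 0.04, 2: 0.90, 3: 0.04, 4: 0.01}
print(significance_by_probability(peaked, 4))      # P(4 or more) = 0.01
```

The check uses the probability of a value at least as extreme as x, not the probability of exactly x, because any single exact value can have a small probability when there are many possible outcomes.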