Foundations of Statistics: Populations, Samples, and Data Classification
Study Guide - Smart Notes
Introduction to Statistics
Objectives
Define statistics and its key concepts.
Distinguish between a population and a sample, and between a parameter and a statistic.
Differentiate between descriptive and inferential statistics.
Definition of Statistics
Statistics is the science of collecting, organizing, analyzing, and interpreting data to make decisions.
Data
Data consists of information coming from observations, counts, measurements, or responses.
Examples:
7 in 10 Americans believe the arts unify their communities.
21% of 8–11 year-olds have a social media profile.
Data Sets
Population: The collection of all outcomes, responses, measurements, or counts that are of interest.
Sample: A subset, or part, of the population.
Example: Identifying Data Sets
In a survey, 834 employees in the U.S. were asked if they thought their jobs were highly stressful. Of the 834 respondents, 517 said yes.
Population: All employees in the U.S.
Sample: The 834 employees surveyed.
Sample Data Set: 517 "yes" and 317 "no" responses.
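The counts in this example can be checked with a few lines of arithmetic. This is a minimal sketch; the variable names are illustrative and not part of the original notes.

```python
# Counts taken from the survey example above.
sample_size = 834   # employees surveyed (the sample)
yes_count = 517     # respondents who said their jobs are highly stressful

# The remaining respondents answered "no".
no_count = sample_size - yes_count

# Share of the sample that answered "yes".
yes_proportion = yes_count / sample_size

print(f"'no' responses: {no_count}")
print(f"Share answering 'yes': {yes_proportion:.1%}")
```

The proportion describes only the 834 people surveyed. Saying anything about all U.S. employees requires inference from the sample to the population.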
Parameters and Statistics
Parameter
A parameter is a numerical description of a population characteristic.
Example: Average age of all people in the United States.
Statistic
A statistic is a numerical description of a sample characteristic.
Example: Average age of people from a sample of three states.
Example: Distinguishing Parameter and Statistic
If a survey of 9400 individuals finds an average of 5.19 hours per day spent in leisure, this is a statistic (since it is based on a sample).
If the average SAT math score for an entire freshman class is 514, this is a parameter (since it is based on the whole population).
If 34% of stores in a sample were not storing fish at the proper temperature, this is a statistic.
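The distinction can also be seen in code. The sketch below assumes a small, made-up population of ages (the values and names are hypothetical, chosen only for illustration): the mean of every value is a parameter, while the mean of a randomly drawn subset is a statistic.

```python
import random
from statistics import mean

# Hypothetical population: ages of every member of a small club (made-up values).
population_ages = [23, 35, 41, 29, 52, 47, 38, 31, 60, 26]

# Parameter: computed from the entire population.
population_mean = mean(population_ages)

# Statistic: computed from a sample drawn from that population.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample_ages = rng.sample(population_ages, k=4)
sample_mean = mean(sample_ages)

print(f"Parameter (population mean): {population_mean}")
print(f"Statistic (mean of sample {sample_ages}): {sample_mean}")
```

Drawing a different sample would generally give a different statistic, while the parameter stays fixed for a given population.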
Branches of Statistics
Descriptive Statistics
Involves the organization, summarization, and display of data.
Examples: Tables, charts, averages.
Inferential Statistics
Involves using sample data to draw conclusions about a population.
Example: Descriptive and Inferential Statistics
A study of 1502 U.S. adults found that 18% of adults from households earning less than $30,000 annually do not use the Internet.
Population: All U.S. adults.
Sample: 1502 adults surveyed.
Descriptive: "18% of adults from households earning less than $30,000 do not use the Internet."
Inferential: A possible conclusion about the population is that the Internet is less accessible to lower-income households.
Chapter 1.2: Data Classification
Objectives
Distinguish between qualitative and quantitative data.
Classify data by the four levels of measurement: nominal, ordinal, interval, and ratio.
Types of Data
Qualitative Data: Consists of attributes, labels, or nonnumerical entries.
Examples: Major, place of birth, eye color.
Quantitative Data: Consists of numerical measurements or counts.
Examples: Age, weight, temperature.
Example: Classifying Data by Type
The following table shows sports-related head injuries treated in U.S. emergency rooms:
| Sport | Head injuries treated |
|---|---|
| Basketball | 131,930 |
| Baseball | 83,522 |
| Football | 220,258 |
| Gymnastics | 26,505 |
| Hockey | 41,450 |
| Soccer | 98,710 |
| Softball | 41,216 |
| Swimming | 43,815 |
| Volleyball | 13,848 |
Qualitative Data: Types of sports (nonnumerical entries).
Quantitative Data: Number of head injuries (numerical entries).
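As a rough illustration, the same classification can be sketched in code using a few rows from the table. The helper `data_type` is hypothetical and uses a simple rule of thumb (numbers are quantitative, everything else is qualitative), which is an assumption made for this sketch.

```python
from numbers import Number

# A few rows from the table above: (sport, head injuries treated).
rows = [
    ("Basketball", 131_930),
    ("Baseball", 83_522),
    ("Football", 220_258),
]

def data_type(values):
    """Label a column quantitative if every entry is a number, else qualitative."""
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
        return "quantitative"
    return "qualitative"

sports = [sport for sport, _ in rows]
injuries = [count for _, count in rows]

print("Sport:", data_type(sports))
print("Head injuries treated:", data_type(injuries))
```

The rule of thumb has limits: numeric labels such as ZIP codes or jersey numbers are stored as numbers but act as qualitative data, so a check like this cannot replace judgment about what the values mean.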
Levels of Measurement
Nominal Level
Qualitative data only.
Data are categorized using names, labels, or qualities.
No mathematical computations can be made.
Ordinal Level
Qualitative or quantitative data.
Data can be arranged in order, or ranked.
Differences between data entries are not meaningful.
Interval Level
Quantitative data.
Data can be ordered.
Differences between data entries are meaningful.
Zero represents a position on a scale, not an inherent zero (zero does not mean "none").
Ratio Level
Quantitative data.
Similar to interval level, but zero entry is an inherent zero (implies "none").
Ratios of data values can be formed.
One data value can be expressed as a multiple of another.
Example: Classifying Data by Level
| Data Set | Level of Measurement |
|---|---|
| Top five U.S. occupations with the most job growth | Ordinal (can be ranked, but differences are not meaningful) |
| Movie genres (Action, Adventure, Comedy, Drama, Horror) | Nominal (categories only, no order) |
Example: Interval vs. Ratio Level
| Data Set | Level of Measurement |
|---|---|
| New York Yankees' World Series victories (years) | Interval (differences are meaningful, but no true zero) |
| 2020 American League home run totals (by team) | Ratio (true zero exists, ratios are meaningful) |
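
A short sketch can make the interval/ratio contrast concrete. The two years are Yankees championship seasons; the home run totals are hypothetical values chosen for illustration, not actual 2020 figures.

```python
# Interval level: calendar years of two Yankees World Series wins.
first_win, later_win = 1923, 2009
years_apart = later_win - first_win   # meaningful: 86 years between the wins
year_ratio = later_win / first_win    # computable, but carries no real meaning

# Ratio level: home run totals for two teams (hypothetical values).
team_a_home_runs, team_b_home_runs = 100, 50
home_run_ratio = team_a_home_runs / team_b_home_runs  # meaningful: twice as many

print(f"Years between wins: {years_apart}")
print(f"Year 'ratio' (not meaningful): {year_ratio:.3f}")
print(f"Team A hit {home_run_ratio:.0f} times as many home runs as Team B")
```

Python will happily divide one year by another. Whether the result means anything depends on the level of measurement, not on whether the arithmetic can be carried out.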
Summary Table: Four Levels of Measurement
| Level of Measurement | Put data in categories | Arrange data in order | Subtract data values | Determine if one data value is a multiple of another |
|---|---|---|---|---|
| Nominal | Yes | No | No | No |
| Ordinal | Yes | Yes | No | No |
| Interval | Yes | Yes | Yes | No |
| Ratio | Yes | Yes | Yes | Yes |
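
The summary table has a cumulative pattern: each level allows everything the previous level allows, plus one more operation. One way to sketch that pattern in code is shown below; the names `Level`, `MINIMUM_LEVEL`, and `allowed_operations` are hypothetical and introduced only for this illustration.

```python
from enum import IntEnum

class Level(IntEnum):
    """The four levels of measurement, ordered from least to most permissive."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4

# Each operation becomes meaningful at one level and stays meaningful above it.
MINIMUM_LEVEL = {
    "categorize": Level.NOMINAL,
    "order": Level.ORDINAL,
    "subtract": Level.INTERVAL,
    "form ratios": Level.RATIO,
}

def allowed_operations(level):
    """Return the operations from the summary table that are meaningful at this level."""
    return [op for op, minimum in MINIMUM_LEVEL.items() if level >= minimum]

for level in Level:
    print(f"{level.name.title():<8} {', '.join(allowed_operations(level))}")
```

Using an ordered enum keeps the "Yes" entries in each row of the table as a single comparison rather than a separate lookup per cell.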
Examples of Data Sets and Calculations
| Level | Example of a Data Set | Meaningful Calculations |
|---|---|---|
| Nominal | Types of shows televised by a network (Comedy, Drama, Reality, etc.) | Put in a category only. No order or arithmetic operations. |
| Ordinal | Movie ratings (G, PG, PG-13, R, NC-17) | Put in a category and order. Differences between ranks are not meaningful. |
Key Formulas and Notation
Population parameter: $\mu$ (mu) for mean, $\sigma$ (sigma) for standard deviation.
Sample statistic: $\bar{x}$ (x-bar) for mean, $s$ for standard deviation.
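The notes list the notation without the formulas behind it. For reference, the standard definitions are sketched below. The symbols $N$ (population size), $n$ (sample size), and $x_i$ (the individual data values) are notation assumed here and do not appear in the original notes.

$$
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}
$$

$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}
$$

The population formulas run over all $N$ values, while the sample formulas run over the $n$ sampled values. The sample standard deviation conventionally divides by $n - 1$ rather than $n$.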
Summary
Statistics is the science of data collection, analysis, and interpretation.
Populations and samples are fundamental concepts; parameters describe populations, statistics describe samples.
Data can be qualitative or quantitative, and classified by four levels of measurement: nominal, ordinal, interval, and ratio.
Descriptive statistics summarize data; inferential statistics draw conclusions about populations from samples.