Descriptive Statistics: Frequency Distributions, Graphs, and Measures of Central Tendency

Descriptive Statistics

Introduction

Descriptive statistics are methods for summarizing and organizing data so that patterns and key features can be easily understood. This chapter covers frequency distributions, graphical representations of data, and measures of central tendency, which are foundational concepts in statistics.

Frequency Distributions

Definition and Construction

A frequency distribution is a table that displays classes (or intervals) of data entries with a count of the number of entries in each class. The frequency of a class is the number of data entries that fall into that class. Frequency distributions help organize data and reveal patterns.

  • Class Limits: The smallest and largest data values that can belong to a class.

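  • Class Width: The distance between the lower (or upper) limits of consecutive classes. To choose a width for a new distribution, calculate $\text{class width} = \frac{\text{range}}{\text{number of classes}}$ and round up to the next whole number.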

  • Lower Class Limit (LCL): The smallest value in a class.

  • Upper Class Limit (UCL): The largest value in a class.

  • Class Boundaries: Numbers that separate classes without forming gaps between them. Each boundary is the average of the upper limit of one class and the lower limit of the next class.
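The averaging rule for class boundaries can be sketched in a few lines of Python. This is only an illustration: the function name `class_boundaries` and the sample class limits are assumptions, and the sketch assumes the gap between consecutive classes is the same throughout.

```python
def class_boundaries(limits):
    """Return (lower_boundary, upper_boundary) pairs for a list of
    (lower_limit, upper_limit) classes given in increasing order."""
    # Half the gap between one class's upper limit and the next class's lower limit.
    half_gap = (limits[1][0] - limits[0][1]) / 2
    return [(low - half_gap, high + half_gap) for low, high in limits]


# Hypothetical classes with integer limits.
classes = [(14, 19), (20, 25), (26, 31)]
for (low, high), (lb, ub) in zip(classes, class_boundaries(classes)):
    print(f"{low}-{high}: boundaries {lb} to {ub}")
```

For the limits 19 and 20, the shared boundary is (19 + 20) / 2 = 19.5, so the first class runs from 13.5 to 19.5 and the second from 19.5 to 25.5.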

Steps to Construct a Frequency Distribution

  1. Find the range: $\text{range} = \text{maximum entry} - \text{minimum entry}$

  2. Determine the class width.

  3. Find the lower class limits.

  4. Find the upper class limits.

  5. Tally the data into classes.

  6. Count the frequency for each class.

  7. Find the midpoint: $\text{midpoint} = \frac{\text{lower class limit} + \text{upper class limit}}{2}$

  8. Calculate relative frequency: $\text{relative frequency} = \frac{f}{n}$, where $f$ is the class frequency and $n$ is the total number of data entries.

  9. Find cumulative frequency: The sum of frequencies for the current and all previous classes.
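The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `frequency_distribution`, the choice of five classes, and the sample values are assumptions. It also assumes integer data and starts the first class at the minimum entry.

```python
def frequency_distribution(data, num_classes):
    """Build a frequency distribution for integer data.

    Returns one dict per class with its limits, frequency, midpoint,
    relative frequency, and cumulative frequency.
    """
    n = len(data)
    data_range = max(data) - min(data)           # step 1: range
    # Step 2: round range / num_classes up. When the quotient is already
    # whole we still go one higher (an assumption) so the maximum fits.
    width = data_range // num_classes + 1

    rows = []
    cumulative = 0
    for i in range(num_classes):
        lower = min(data) + i * width            # step 3: lower class limit
        upper = lower + width - 1                # step 4: upper class limit
        freq = sum(lower <= x <= upper for x in data)   # steps 5-6: tally and count
        cumulative += freq                       # step 9: running total
        rows.append({
            "class": (lower, upper),
            "frequency": freq,
            "midpoint": (lower + upper) / 2,     # step 7
            "relative": freq / n,                # step 8
            "cumulative": cumulative,
        })
    return rows


# Illustrative sample of 15 integer values.
scores = [75, 67, 80, 76, 84, 90, 89, 75, 80, 83, 89, 62, 79, 81, 78]
for row in frequency_distribution(scores, 5):
    low, high = row["class"]
    print(f"{low}-{high}  f={row['frequency']}  mid={row['midpoint']}  "
          f"rel={row['relative']:.2f}  cum={row['cumulative']}")
```

A useful check on any frequency distribution is that the last cumulative frequency equals the number of data entries and the relative frequencies sum to 1.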

Example Frequency Distribution Table

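| Class | Frequency | Midpoint | Relative Frequency | Cumulative Frequency |
|---|---|---|---|---|
| 14–19 | 1 | 16.5 | 0.0625 | 1 |
| 20–25 | 4 | 22.5 | 0.2500 | 5 |
| 26–31 | 3 | 28.5 | 0.1875 | 8 |
| 32–37 | 3 | 34.5 | 0.1875 | 11 |
| 38–43 | 2 | 40.5 | 0.1250 | 13 |
| 44–49 | 1 | 46.5 | 0.0625 | 14 |
| 50–55 | 2 | 52.5 | 0.1250 | 16 |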

Additional info: Table values inferred for illustration.

Graphical Representation of Data

Histograms

A histogram is a bar graph that represents the frequency distribution of a data set. The horizontal axis shows the classes, and the vertical axis shows the frequencies. Bars must touch, indicating continuous data.

  • Relative Frequency Histogram: The vertical axis shows relative frequencies instead of raw counts.

  • Class Boundaries: Used to avoid gaps between bars.

Frequency Polygon

A frequency polygon is a graph that uses line segments to connect points plotted at the class midpoints and frequencies. It emphasizes the continuous change in frequencies.

Stem-and-Leaf Plot

A stem-and-leaf plot separates each number into a stem (all but the last digit) and a leaf (the last digit). It retains the original data values and is useful for small data sets.

  • Example: For the data set 82, 85, 95, 91, 73, 70, 75, 78, 82, 97, 59, 65, 89, 72, 71, 67, 55, 94, 91, 80, 54, 73, 51, 80, 77, 92, 66, 50, 75, 76, 82, 90, 81, 53, 78, 74, 78, 81, 85, 76, the stem-and-leaf plot would organize values by tens (stems) and units (leaves).
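As a sketch of how the plot for this data set could be produced, the Python below splits each value into a stem and a leaf with `divmod`. The function name `stem_and_leaf` is an assumption for illustration, and the code assumes non-negative integers.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (all but the last digit) to its sorted leaves (last digit)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)


data = [82, 85, 95, 91, 73, 70, 75, 78, 82, 97, 59, 65, 89, 72, 71, 67,
        55, 94, 91, 80, 54, 73, 51, 80, 77, 92, 66, 50, 75, 76, 82, 90,
        81, 53, 78, 74, 78, 81, 85, 76]

for stem, leaves in stem_and_leaf(data).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))

# Prints:
# 5 | 0 1 3 4 5 9
# 6 | 5 6 7
# 7 | 0 1 2 3 3 4 5 5 6 6 7 8 8 8
# 8 | 0 0 1 1 2 2 2 5 5 9
# 9 | 0 1 1 2 4 5 7
```

Reading across a row recovers the original values: the row `6 | 5 6 7` stands for 65, 66, and 67.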

Dot Plot

A dot plot displays each data entry as a point above a horizontal axis. It is useful for visualizing frequency and distribution for small data sets.

Pie Chart

A pie chart presents qualitative data graphically as percents of a whole. The area of each sector is proportional to the frequency of its category.
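To make the proportionality concrete, the sketch below converts category frequencies into percents and central angles (each category's share of 360 degrees). The function name `pie_sectors` and the category counts are hypothetical, chosen only for illustration.

```python
def pie_sectors(counts):
    """Return {category: (percent_of_whole, central_angle_in_degrees)}."""
    total = sum(counts.values())
    return {
        category: (freq * 100 / total, freq * 360 / total)
        for category, freq in counts.items()
    }


# Hypothetical frequencies for three categories.
counts = {"A": 10, "B": 25, "C": 15}
for category, (percent, angle) in pie_sectors(counts).items():
    print(f"{category}: {percent:.1f}% of the whole, {angle:.1f} degree sector")
```

Here category B holds 25 of 50 entries, so it takes half the circle: 50% and a 180 degree sector.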

Pareto Chart

A Pareto chart is a bar graph for qualitative data, with bars arranged in order of decreasing height. It is used to highlight the most significant categories.
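The ordering step behind a Pareto chart can be sketched as a sort by descending frequency. The function name `pareto_order` and the complaint categories below are hypothetical, used only to show the idea.

```python
def pareto_order(counts):
    """Return (category, frequency) pairs sorted from tallest bar to shortest."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical frequencies for four categories.
complaints = {
    "Late delivery": 12,
    "Damaged item": 30,
    "Wrong item": 7,
    "Billing error": 18,
}

# Print a simple text version of the chart, one '#' per occurrence.
for category, freq in pareto_order(complaints):
    print(f"{category:<14} {'#' * freq} ({freq})")
```

With the bars in this order, the category that contributes most ("Damaged item" in this made-up data) always appears first.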

Scatter Plot

A scatter plot displays ordered pairs as points in a coordinate plane, showing the relationship between two quantitative variables. It is useful for identifying correlations.

  • Example: A scatter plot of Fisher's Iris data set can show petal length versus petal width for different species of iris.

Time Series

A time series consists of quantitative entries taken at regular intervals over time. It is used to analyze trends and patterns.

| Year | Degrees (thousands) |
|---|---|
| 2012 | 93.1 |
| 2013 | 98.7 |
| 2014 | 103.0 |
| 2015 | 109.0 |
| 2016 | 115.1 |
| 2017 | 123.9 |
| 2018 | 133.8 |
| 2019 | 140.7 |
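One simple way to look for a trend in these values is to compute the change from each year to the next. The sketch below does that for the table above; the function name `year_over_year` is an assumption, and the differences are rounded to one decimal place to match the precision of the data.

```python
def year_over_year(series):
    """Return {year: change from the previous year}, rounded to 1 decimal."""
    years = sorted(series)
    return {
        later: round(series[later] - series[earlier], 1)
        for earlier, later in zip(years, years[1:])
    }


# Values from the table, in thousands of degrees.
degrees = {2012: 93.1, 2013: 98.7, 2014: 103.0, 2015: 109.0,
           2016: 115.1, 2017: 123.9, 2018: 133.8, 2019: 140.7}

for year, change in year_over_year(degrees).items():
    print(f"{year}: {change:+.1f} thousand")
```

Every change is positive, which shows that the series increases in each year of the table.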

Measures of Central Tendency

Mean, Median, and Mode

Measures of central tendency describe the center of a data set. The most common are the mean, median, and mode.

  • Population Mean: $\mu = \frac{\sum x}{N}$, where $\mu$ is the population mean, $\sum x$ is the sum of all values, and $N$ is the number of values in the population.

  • Sample Mean: $\bar{x} = \frac{\sum x}{n}$, where $\bar{x}$ is the sample mean, $\sum x$ is the sum of all sample values, and $n$ is the sample size.

  • Median: The middle value when the data are ordered. If the data set has an even number of entries, the median is the mean of the two middle values.

  • Mode: The value(s) that occur most frequently. A data set may have no mode, one mode, or multiple modes.

Example Calculation

  • Data: 75, 67, 80, 76, 84, 90, 89, 75, 80, 83, 89, 62, 79, 81, 78

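  • Mean: $\bar{x} = \frac{\sum x}{n} = \frac{1188}{15} = 79.2$

  • Median: The ordered data are 62, 67, 75, 75, 76, 78, 79, 80, 80, 81, 83, 84, 89, 89, 90. With 15 entries, the median is the 8th value, 80.

  • Mode: 75, 80, and 89 each occur twice, more often than any other value, so the data set has three modes.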
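The three measures for this data set can be checked with Python's standard `statistics` module. This is a sketch that treats the data as a sample; the variable names are illustrative.

```python
import statistics

scores = [75, 67, 80, 76, 84, 90, 89, 75, 80, 83, 89, 62, 79, 81, 78]

mean = statistics.mean(scores)        # sum of the entries divided by their count
median = statistics.median(scores)    # middle value of the sorted entries
modes = statistics.multimode(scores)  # every value tied for the highest count

print("mean:", mean)      # 79.2
print("median:", median)  # 80
print("modes:", modes)    # [75, 80, 89]
```

`statistics.multimode` (Python 3.8 and later) is used instead of `statistics.mode` because it reports every value tied for the highest count, which matters for a data set with more than one mode.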

Outliers

An outlier is a data entry that is far removed from other entries. Outliers can greatly affect the mean, making it less representative of the data set's center.
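The effect of an outlier can be seen by comparing the mean and median before and after adding one extreme value. The numbers below are hypothetical and chosen only to make the shift easy to see.

```python
import statistics

# Hypothetical data set with no outlier.
values = [20, 21, 22, 23, 24]
# The same data with one entry far removed from the rest.
with_outlier = values + [95]

print("mean without outlier:  ", statistics.mean(values))
print("median without outlier:", statistics.median(values))
print("mean with outlier:     ", round(statistics.mean(with_outlier), 2))
print("median with outlier:   ", statistics.median(with_outlier))
```

The single extreme entry pulls the mean from 22 to about 34.17, while the median only moves from 22 to 22.5. This is why the median is often the better description of the center when outliers are present.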

Misleading Graphs

Graphical Integrity

Graphs can be misleading if scales are manipulated or if visual representations exaggerate differences. Always check axis scales and graphical proportions.

  • Example: Bar heights may exaggerate differences if the vertical axis does not start at zero.

  • Example: 3D effects or disproportionate images can distort perception of data.
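The first example can be quantified with a little arithmetic: when the vertical axis starts above zero, the ratio of the drawn bar heights differs from the ratio of the actual values. The function name `apparent_ratio` and the values used are hypothetical.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn heights of bars a and b when the vertical
    axis begins at axis_start (assumes both values exceed axis_start)."""
    return (a - axis_start) / (b - axis_start)


# Hypothetical values: 55 versus 50.
print("axis starts at 0: ", round(apparent_ratio(55, 50), 2))
print("axis starts at 45:", round(apparent_ratio(55, 50, axis_start=45), 2))
```

With the axis starting at zero, the first bar is drawn 1.1 times as tall as the second, matching the real 10% difference. With the axis starting at 45, the same bars are drawn 10 and 5 units tall, so the first looks twice as large.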

Summary Table: Types of Graphs and Their Uses

| Graph Type | Data Type | Main Purpose |
|---|---|---|
| Histogram | Quantitative | Show frequency distribution |
| Frequency Polygon | Quantitative | Show continuous change in frequency |
| Stem-and-Leaf Plot | Quantitative | Retain original data values |
| Dot Plot | Quantitative | Visualize frequency for small data sets |
| Pie Chart | Qualitative | Show proportions of categories |
| Pareto Chart | Qualitative | Highlight most significant categories |
| Scatter Plot | Quantitative (paired) | Show relationship between variables |
| Time Series | Quantitative (over time) | Analyze trends and patterns |

Key Takeaways

  • Descriptive statistics organize and summarize data for easier interpretation.

  • Frequency distributions and graphs reveal patterns and relationships in data.

  • Measures of central tendency (mean, median, mode) describe the center of a data set.

  • Outliers and misleading graphs can distort statistical conclusions.
