BackDisplaying Descriptive Statistics: Tools and Techniques in Business Statistics
Study Guide - Smart Notes
Tailored notes based on your materials, expanded with key definitions, examples, and context.
Displaying Descriptive Statistics
Introduction
Descriptive statistics are essential tools in business statistics, enabling analysts to summarize, organize, and present data in meaningful ways. This chapter focuses on the various methods and technologies used to display and interpret both quantitative and qualitative data, with a particular emphasis on the use of Microsoft Excel for statistical analysis.
The Role Technology Plays in Statistics
Using Microsoft Excel for Statistical Analysis
Modern statistical analysis in business often relies on technology to efficiently process and visualize data. Microsoft Excel is a widely used software that offers built-in options for data presentation and statistical analysis.
Data Analysis ToolPak: An Excel add-in that provides advanced statistical analysis tools, such as descriptive statistics, histograms, and regression analysis.
To access these features, users may need to activate the Data Analysis ToolPak Add-in (especially in the Windows version of Excel).
Steps to Activate the Data Analysis ToolPak in Excel 2016:
Open Excel and click the File tab in the upper left corner.
Select Options from the left margin to open the Excel Options dialog.
Click Add-Ins in the left margin.
At the bottom of the window, select Go next to Manage: Excel Add-ins.
Check the boxes for Analysis ToolPak and Analysis ToolPak - VBA, then click OK.
Once activated, the Data Analysis tools can be accessed from the Data tab in Excel.
Displaying Quantitative Data
Types of Data
Quantitative Data: Numerical values that can be measured or counted (e.g., sales figures, weights).
Qualitative Data: Categorical values that describe characteristics or qualities (e.g., gender, product type).
Quantitative data is the focus of many descriptive statistics techniques, such as frequency distributions and histograms.
Constructing a Frequency Distribution
A frequency distribution shows the number of data observations that fall into specific intervals, called classes.
Each class represents a range of values.
Frequency distributions help identify patterns and trends in data.
Example: Number of iPads sold per day over 50 days.
Discrete vs. Continuous Data
Discrete Data: Values that can be counted and are typically whole numbers (e.g., number of products sold).
Continuous Data: Values that can take on any value within a range, including decimals (e.g., weight, time).
Relative Frequency Distributions
A relative frequency distribution displays the proportion of observations in each class relative to the total number of observations.
Calculated as:
The sum of all relative frequencies should equal 1.00.
Cumulative Relative Frequency Distributions
A cumulative relative frequency distribution shows the proportion of observations that are less than or equal to a given class boundary.
It accumulates the relative frequencies as you move from the lowest to the highest class.
The final cumulative relative frequency should be 1.00 (or very close, due to rounding).
Using a Histogram to Graph a Frequency Distribution
A histogram is a graphical representation of a frequency distribution, where each bar represents the frequency (or relative frequency) of a class interval (bin).
Excel refers to class intervals as "bins."
Histograms are useful for visualizing the shape and spread of data.
Constructing a Histogram in Excel
Enter your data and define the bin ranges.
Use the Data Analysis ToolPak to select "Histogram."
Specify the input range (data) and bin range (class boundaries).
Choose output options (e.g., new worksheet, chart output).
Format the histogram as needed (e.g., adjust bin width, remove unnecessary bins).
The Shape of Histograms
Symmetric: The right and left sides are mirror images.
Skewed: Data may be skewed left or right, indicating a longer tail on one side.
Grouped Quantitative Data
For large or continuous data sets, values are grouped into classes to simplify the frequency distribution.
The number of classes is typically between 4 and 20.
Use the rule to determine the number of classes, where is the number of classes and is the number of data points.
Class width is calculated as:
Rules for Classes in Grouped Data
All classes must be of equal width.
Classes must be mutually exclusive (no overlap).
All data values must be included.
Avoid empty or open-ended classes if possible.
The Ogive
An ogive is a line graph that plots the cumulative relative frequency distribution, providing a visual representation of the accumulation of data values.
Displaying Qualitative Data
Qualitative Data and Frequency Distributions
Qualitative data are categorical and can be nominal (no order) or ordinal (ordered categories). Frequency distributions for qualitative data show the number of occurrences in each category.
Excel's COUNTIF function can be used to count the number of values matching a category label.
Bar Charts
Bar charts are used to display qualitative data organized into categories. Bars can be vertical or horizontal.
Clustered Bar Chart: Groups several values side by side within the same category.
Stacked Bar Chart: Stacks values within the same category in a single bar.
Pareto Charts
A Pareto chart is a bar chart that displays the frequency of categories in descending order, often used to identify the most significant factors in quality control. It also includes a cumulative relative frequency line (ogive).
Pie Charts
Pie charts are used to compare proportions for categorical data, with each segment representing the relative frequency of a category. All categories must be included, and the chart is useful for visualizing the relative sizes of categories.
Contingency Tables
Introduction to Contingency Tables
Contingency tables (also known as cross-tabulations) display the frequencies of two qualitative variables, helping to identify relationships between them.
Constructed in Excel using Pivot Tables.
Rows and columns represent different categories of the two variables.
Example Table:
Payment Method | Under 30 | 30 and Over | Total |
|---|---|---|---|
Cash | 5 | 2 | 7 |
Credit | 7 | 5 | 12 |
Debit | 5 | 8 | 13 |
Total | 17 | 15 | 32 |
Additional info: Table values are inferred for illustration.
Stem and Leaf Display
Introduction to Stem and Leaf Displays
A stem and leaf display splits data values into stems (larger place values) and leaves (smaller place values), providing a graphical representation similar to a histogram but retaining the original data values.
Easy to construct by hand.
All original data points are visible.
Example: Exam scores: 78, 81, 85, 92, 95
Stem | Leaf |
|---|---|
7 | 8 |
8 | 1 5 |
9 | 2 5 |
Scatter Plots
Introduction to Scatter Plots
Scatter plots provide a visual representation of the relationship between two quantitative variables. The independent variable is plotted on the horizontal axis, and the dependent variable on the vertical axis.
Useful for identifying correlations and trends.
Each point represents a pair of values.
Line Charts and Sparklines
Line Chart: A scatter plot where data points are connected by line segments, often used for time series data.
Sparklines: Small, cell-sized charts in Excel that provide a compact visual summary of data trends.
Additional info: This guide covers the main methods for displaying descriptive statistics in business, with practical steps for using Excel as a tool for analysis and visualization.