Introduction to Statistics: Key Concepts, Data Types, and Sampling Methods

Introduction to Statistics

Overview of Statistics

Statistics is the science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, and interpreting those data to draw conclusions. It is fundamental for making informed decisions in the presence of variability and uncertainty.

  • Statistics involves both the collection and analysis of data.

  • It is used in a wide range of fields, including business, healthcare, social sciences, and engineering.

Statistical and Critical Thinking

The Statistical Process

Conducting a statistical study involves three main steps: prepare, analyze, and conclude. Critical thinking is essential throughout this process to ensure meaningful and valid results.

  • Prepare: Define the context, identify the source of data, and determine the sampling method.

  • Analyze: Graph and explore the data, apply statistical methods, and summarize findings.

  • Conclude: Assess the significance of results and their practical implications.

Statistical thinking requires more than just performing calculations; it involves understanding the context, recognizing potential biases, and interpreting results appropriately.

Steps in Statistical Analysis

  • Context: What is the goal of the study? What do the data represent?

  • Source of Data: Are the data from an unbiased source? Does the source have a special interest that could affect the results?

  • Sampling Method: Were the data collected in a way that avoids bias?

  • Graph the Data: Visualize the data to identify patterns or outliers.

  • Explore the Data: Summarize with key statistics (mean, median, standard deviation, etc.).

  • Apply Statistical Methods: Use appropriate techniques to analyze the data.

  • Significance: Are the results statistically and practically significant?
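As a minimal sketch of the "Explore the Data" step, the summary statistics named above can be computed with Python's standard `statistics` module. The values in `data` are hypothetical and were invented only for illustration.

```python
import statistics

# Hypothetical measurements, invented purely for illustration.
# The last value is a deliberate outlier.
data = [12, 15, 11, 14, 13, 15, 48]

mean = statistics.mean(data)      # arithmetic average
median = statistics.median(data)  # middle value of the sorted data
stdev = statistics.stdev(data)    # sample standard deviation (n - 1 denominator)

print(f"mean={mean:.2f}, median={median}, stdev={stdev:.2f}")
```

Here the single outlier pulls the mean above the median, which is one reason to graph the data before summarizing it.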

Populations, Samples, and Parameters

Population

A population is the complete collection of all measurements or data that are being considered. It is the group about which we want to draw conclusions.

  • Example: All human resource professionals in a country.

Sample

A sample is a subcollection of members selected from a population. Samples are used to make inferences about the population.

  • Example: 410 human resource professionals surveyed from the total population.

Parameter and Statistic

  • Parameter: A numerical measurement describing some characteristic of a population.

  • Statistic: A numerical measurement describing some characteristic of a sample.

  • Example: If about 3.8% of a sample of 320 tested widgets (12 of 320) failed quality control, that 3.8% is a statistic. If all 5,210 widgets in the population were tested, the resulting failure percentage would be a parameter.
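The distinction can be sketched in code: the same calculation gives a parameter when it uses the whole population and a statistic when it uses only a sample. The population below is simulated, and its 4% failure rate is an assumption made for illustration, not a figure from the example above.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 5,210 widgets, where True means "failed quality control".
# The 4% failure rate used to build it is an assumption for illustration.
population = [random.random() < 0.04 for _ in range(5210)]

# Parameter: computed from every member of the population (a census).
parameter = sum(population) / len(population)

# Statistic: computed from a sample of 320 widgets drawn without replacement.
sample = random.sample(population, 320)
statistic = sum(sample) / len(sample)

print(f"parameter = {parameter:.1%}, statistic = {statistic:.1%}")
```

The statistic will usually be close to the parameter but not equal to it, which is why samples are used to estimate, rather than state, population values.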

Census vs. Sample

  • Census: Data collected from every member of a population.

  • Sample: Data collected from a subset of the population.

Types of Data

Quantitative vs. Qualitative Data

  • Quantitative (Numerical) Data: Consist of numbers representing counts or measurements. Examples: Weights of supermodels, ages of respondents.

  • Qualitative (Categorical) Data: Consist of names, labels, or categories. Examples: Gender (male/female), survey responses (yes/no/maybe), shirt numbers (as labels).

Discrete vs. Continuous Data

  • Discrete Data: Quantitative data with a finite or countable number of possible values (0, 1, 2, and so on). Example: Number of coin tosses before getting heads.

  • Continuous Data: Quantitative data with infinitely many possible values that cannot be counted, because they fall on a continuous scale with no gaps; these are often measurements. Example: Lengths between 0 cm and 12 cm.
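A short sketch may help contrast the two examples. The function name `tosses_before_heads` is hypothetical, and the coin is simulated with Python's `random` module.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

def tosses_before_heads():
    """Count the tails tossed before the first heads (a discrete value)."""
    count = 0
    while random.random() < 0.5:  # treat a draw below 0.5 as tails
        count += 1
    return count

# Discrete: the result is always a whole number (0, 1, 2, ...).
discrete_value = tosses_before_heads()

# Continuous: any length between 0 cm and 12 cm is possible.
continuous_value = random.uniform(0.0, 12.0)

print(discrete_value, continuous_value)
```

The count has no upper limit, yet its possible values can still be listed one by one, so it is discrete; the length can take any value in the interval, so it is continuous.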

Levels of Measurement

Data can be classified into four levels of measurement, which determine the types of statistical analyses that are appropriate.

| Level | Description | Examples |
|---|---|---|
| Nominal | Categories/labels only; no order | Survey responses: yes, no, undecided |
| Ordinal | Categories with some order; differences not meaningful | Course grades: A, B, C, D, F |
| Interval | Ordered; differences meaningful; no natural zero | Years: 1000, 1776, 2000 |
| Ratio | Ordered; differences and ratios meaningful; natural zero | Class times: 50 min, 100 min |

Collecting Sample Data

Observational Studies vs. Experiments

  • Observational Study: Observe and measure specific characteristics without attempting to modify the subjects. Example: Studying the association between ice cream sales and drownings using past data (potential for confounding variables).

  • Experiment: Apply a treatment and observe its effects on subjects (experimental units). Example: Assigning one group to eat ice cream and another group to abstain, then comparing drowning rates.

Types of Observational Studies

  • Cross-sectional Study: Data are observed, measured, and collected at one point in time.

  • Retrospective (Case-Control) Study: Data are collected from a past time period, for example by examining records.

  • Prospective (Cohort) Study: Data are collected going forward in time from groups (cohorts) that share common factors.

Sampling Methods

Simple Random Sample

A sample of n subjects is selected so that every possible sample of the same size has the same chance of being chosen. This method helps ensure unbiased representation of the population.
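A minimal sketch of a simple random sample, using `random.sample` from Python's standard library. The list of 1,000 student ID numbers is a hypothetical sampling frame.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: ID numbers for 1,000 students.
population = list(range(1, 1001))

# random.sample draws without replacement, so every possible
# subset of 50 IDs has the same chance of being chosen.
srs = random.sample(population, 50)

print(sorted(srs)[:5])
```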

Systematic Sampling

Randomly select a starting point and then select every kth element in the population. For example, after randomly choosing a starting name, select every 3rd name from a list.
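A minimal sketch of systematic sampling with k = 3. The list of 30 names is hypothetical, and restricting the random start to the first k positions is one common choice rather than the only one.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical list of 30 names.
names = [f"person_{i}" for i in range(1, 31)]
k = 3

# Random starting index among the first k positions (an assumed convention).
start = random.randrange(k)

# Take every k-th name from the starting point onward.
systematic = names[start::k]

print(systematic)
```

With 30 names and k = 3, every possible start yields a sample of 10 names.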

Convenience Sampling

Use data that are easy to obtain, such as surveying members of your own class. This method is prone to bias and is generally not recommended for scientific studies.

Stratified Sampling

Divide the population into at least two different groups (strata) that share a characteristic, then draw a random sample from each group. This ensures representation from each subgroup.
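A minimal sketch of stratified sampling. The population, the two strata ("undergraduate" and "graduate"), and the choice of 10 members per stratum are all assumptions made for illustration.

```python
import random
from collections import defaultdict

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical population of (id, stratum) pairs: 150 undergraduates, 50 graduates.
population = [(i, "undergraduate" if i % 4 else "graduate") for i in range(200)]

# Group the members by the characteristic that defines each stratum.
strata = defaultdict(list)
for member in population:
    strata[member[1]].append(member)

# Draw a separate random sample from every stratum.
per_stratum = 10
stratified = {
    name: random.sample(members, per_stratum)
    for name, members in strata.items()
}

print({name: len(chosen) for name, chosen in stratified.items()})
```

Taking the same number from each stratum is only one option; sampling in proportion to stratum size is another.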

Cluster Sampling

Divide the population into sections (clusters), randomly select some clusters, and then use all members from those selected clusters. This method is useful for large, geographically dispersed populations.
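A minimal sketch of cluster sampling. The eight classrooms of 25 students each are hypothetical clusters, and selecting two of them is an arbitrary choice.

```python
import random

random.seed(5)  # fixed seed so the sketch is repeatable

# Hypothetical clusters: 8 classrooms of 25 students each.
classrooms = {
    f"room_{r}": [f"room_{r}_student_{s}" for s in range(25)]
    for r in range(8)
}

# Randomly select whole clusters...
chosen_rooms = random.sample(sorted(classrooms), 2)

# ...then include every member of each selected cluster.
cluster_sample = [
    student for room in chosen_rooms for student in classrooms[room]
]

print(chosen_rooms, len(cluster_sample))
```

Unlike stratified sampling, which takes some members from every group, this takes all members from only some groups.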

Summary Table: Sampling Methods

| Sampling Method | Description | Example |
|---|---|---|
| Simple Random | Every sample of size n has equal chance | Randomly select 50 students from a university |
| Systematic | Select every k-th member after a random start | Every 10th person on a list |
| Convenience | Use easiest data to obtain | Surveying friends |
| Stratified | Random sample from each subgroup | Sample men and women separately |
| Cluster | Randomly select clusters, use all members | Randomly select classrooms, survey all students in them |

Additional info: Understanding the differences between these sampling methods is crucial for designing valid studies and ensuring that results can be generalized to the population.
