Introduction to Statistics: Data Collection, Study Design, and Data Types

Introduction to Statistics

What is Statistics?

Statistics is the science of collecting, organizing, analyzing, interpreting, and presenting data. It provides methods for making decisions and drawing inferences about populations based on sample data.

  • Descriptive Statistics: Methods for summarizing and organizing data (e.g., tables, graphs, averages).

  • Inferential Statistics: Methods for making predictions or inferences about a population based on a sample.

  • Population vs. Sample: A population is the entire group of interest, while a sample is a subset of the population used for analysis.

Example: Estimating the average height of all college students by measuring a sample of 100 students.
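
The height example can be sketched in Python. This is a minimal illustration, not a prescribed method: the population is simulated (5,000 hypothetical heights drawn from a normal distribution with mean 170 cm and standard deviation 10 cm), and the helper name `estimate_mean_height` is an assumption made for this sketch.

```python
import random
import statistics

def estimate_mean_height(population_heights, sample_size, seed=0):
    """Estimate the population mean from a simple random sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sample = rng.sample(population_heights, sample_size)
    return statistics.mean(sample)

# Hypothetical population: 5,000 simulated student heights in centimetres.
rng = random.Random(42)
population = [rng.gauss(170, 10) for _ in range(5000)]

# Descriptive statistic of the whole population (usually unknown in practice).
population_mean = statistics.mean(population)

# Inferential step: estimate that value from only 100 students.
sample_mean = estimate_mean_height(population, 100)

print(f"population mean: {population_mean:.1f} cm")
print(f"sample estimate: {sample_mean:.1f} cm")
```

The sample estimate will generally be close to, but not exactly equal to, the population mean; that gap is the sampling variability that inferential statistics quantifies.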

Collecting Data: Sampling Methods

Sampling Techniques

Sampling is the process of selecting a subset of individuals from a population to estimate characteristics of the whole group. Proper sampling methods reduce bias and improve the reliability of statistical conclusions.

  • Simple Random Sampling: Every member of the population has an equal chance of being selected, and every possible sample of the chosen size is equally likely.

  • Convenience Sampling: Selecting individuals who are easiest to reach; may introduce bias.

  • Cluster Sampling: Dividing the population into clusters, then randomly selecting entire clusters.

  • Systematic Sampling: Selecting every k-th individual from a list after a random start.

  • Stratified Sampling: Dividing the population into strata (groups) and randomly sampling from each group.

Example: To survey student opinions, a university might use stratified sampling to ensure all majors are represented.
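
The probability-based methods above can be sketched with Python's standard `random` module. The roster, the majors, the section labels, and all function names are hypothetical and exist only for illustration. Convenience sampling is omitted because it involves no random selection.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance; every size-n subset is equally likely."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick a random start in the first k items, then take every k-th item."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Group items into strata, then draw a random sample from each stratum."""
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def cluster_sample(population, cluster_of, n_clusters, rng):
    """Randomly choose whole clusters and keep every item in the chosen ones."""
    clusters = {}
    for item in population:
        clusters.setdefault(cluster_of(item), []).append(item)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for c in chosen for item in clusters[c]]

# Hypothetical roster: 120 students as (student_id, major, class_section).
majors = ["Biology", "History", "Math"]
students = [(i, majors[i % 3], f"Section {i % 6}") for i in range(120)]

rng = random.Random(1)  # fixed seed for reproducibility
srs = simple_random_sample(students, 12, rng)
systematic = systematic_sample(students, 10, rng)
stratified = stratified_sample(students, lambda s: s[1], 4, rng)
clustered = cluster_sample(students, lambda s: s[2], 2, rng)

print(len(srs), len(systematic), len(stratified), len(clustered))
```

Note the structural difference between the last two: stratified sampling takes some members from every group, while cluster sampling takes all members from some groups.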

Types of Statistical Studies

Observational Studies vs. Experiments

Statistical studies can be classified based on how data are collected and whether variables are manipulated.

  • Observational Study: Researchers observe subjects without intervening. Useful for identifying associations but not causation.

  • Experiment: Researchers apply treatments and observe effects. When subjects are randomly assigned to groups, a well-designed experiment supports conclusions about cause and effect.

  • Surveys: A type of observational study where participants answer questions.

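Blinding and Placebos: In experiments, blinding prevents subjects, researchers, or both from knowing who receives the treatment, which reduces bias. In a double-blind experiment, neither the subjects nor the researchers who evaluate them know the group assignments. A placebo is an inactive treatment given to the control group.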

Example: Testing a new drug with a placebo group and a treatment group, using double-blind procedures.
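
Random assignment and blinding for such a trial might be sketched as follows. The subject IDs, the group codes "A" and "B", and the function names are assumptions made for this illustration; real trials follow formal randomization protocols.

```python
import random

def assign_groups(subject_ids, seed=None):
    """Randomly split subjects into equal-sized treatment and placebo groups."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    groups = {sid: "treatment" for sid in shuffled[:half]}
    groups.update({sid: "placebo" for sid in shuffled[half:]})
    return groups

def blind(groups, seed=None):
    """Hide group names behind neutral codes; return coded labels and the key."""
    rng = random.Random(seed)
    codes = ["A", "B"]
    rng.shuffle(codes)  # which code means "treatment" is itself random
    key = {"treatment": codes[0], "placebo": codes[1]}
    coded = {sid: key[group] for sid, group in groups.items()}
    return coded, key

# Hypothetical subject IDs S01..S20.
subjects = [f"S{i:02d}" for i in range(1, 21)]
groups = assign_groups(subjects, seed=3)
coded, sealed_key = blind(groups, seed=3)

# Subjects and evaluators would see only the coded labels; the key stays
# sealed until the data have been collected.
print(sorted(coded.items())[:5])
```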

Evaluating Statistical Studies

Guidelines for Assessing Plausibility

To determine if a statistical study is credible, consider the following guidelines:

  • Who conducted the study and why?

  • Is the sample representative of the population?

  • Were the measurements accurate and reliable?

  • Were confounding variables controlled?

  • Was the study randomized and blinded if appropriate?

  • Are the results statistically significant?

  • Are the conclusions justified by the data?

  • Is there evidence of bias or conflicts of interest?

Example: A study funded by a company selling a product may have potential bias.

Describing Data: Types and Measurement

Types of Data

Data can be classified based on their nature and measurement scale.

  • Qualitative (Categorical) Data: Describes qualities or categories (e.g., gender, color).

  • Quantitative (Numerical) Data: Represents counts or measurements (e.g., height, age).

  • Discrete Data: Quantitative data that take countable values (e.g., number of students).

  • Continuous Data: Quantitative data that can take any value within a range (e.g., weight, temperature).

Example: Survey responses (yes/no) are categorical; test scores are numerical.

Dealing with Errors in Data Collection

Types of Errors

Errors can occur during data collection and measurement, affecting the accuracy of results.

  • Random Error: Unpredictable variations that occur by chance; their effect on estimates shrinks as the sample size increases.

  • Systematic Error (Bias): Consistent, repeatable error due to faulty equipment or flawed study design; collecting more data does not reduce it.

  • Measurement Error: Inaccuracies in measuring or recording data values; these can be random or systematic.

Example: A miscalibrated scale introduces systematic error in weight measurements.
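
A small simulation can illustrate why the two kinds of error behave differently. The true weight, the noise level, and the 2 kg calibration offset below are hypothetical values chosen only for the sketch.

```python
import random
import statistics

TRUE_WEIGHT = 70.0        # kg, hypothetical true value
rng = random.Random(7)    # fixed seed for reproducibility

def weigh(n, bias=0.0, noise_sd=0.5):
    """Simulate n readings: true value + systematic bias + random noise."""
    return [TRUE_WEIGHT + bias + rng.gauss(0, noise_sd) for _ in range(n)]

# Well-calibrated scale: only random error, which averages out.
good_scale_few = statistics.mean(weigh(5))
good_scale_many = statistics.mean(weigh(5000))

# Miscalibrated scale: reads 2 kg high every time (systematic error).
bad_scale_many = statistics.mean(weigh(5000, bias=2.0))

print(f"good scale, 5 readings:    {good_scale_few:.2f} kg")
print(f"good scale, 5000 readings: {good_scale_many:.2f} kg")
print(f"bad scale, 5000 readings:  {bad_scale_many:.2f} kg")
```

Averaging many readings pulls the well-calibrated scale's mean toward the true weight, but the miscalibrated scale's mean stays about 2 kg too high no matter how many readings are taken.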

Percentages and Differences in Statistics

Using Percentages

Percentages are commonly used to describe proportions and compare quantities in statistics.

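  • Percentage Formula: $\text{Percentage} = \dfrac{\text{Part}}{\text{Whole}} \times 100\%$

  • Percentage Points: The simple arithmetic difference between two percentages.

  • Relative Change: The percentage increase or decrease from an original value, computed as $\dfrac{\text{New} - \text{Original}}{\text{Original}} \times 100\%$.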

Example: If 40% of students passed a test last year and 50% this year, the increase is 10 percentage points or a 25% relative increase.
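
The arithmetic in this example can be checked with a short sketch. The function names are illustrative, not from any library.

```python
def percentage(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole

def percentage_point_change(old_pct, new_pct):
    """Arithmetic difference between two percentages, in percentage points."""
    return new_pct - old_pct

def relative_change(old_value, new_value):
    """Percentage increase (positive) or decrease (negative) from old_value."""
    return 100 * (new_value - old_value) / old_value

# Pass rates from the example: 40% last year, 50% this year.
pp = percentage_point_change(40, 50)
rel = relative_change(40, 50)

print(f"{pp} percentage points, {rel:.0f}% relative increase")
```

The two measures answer different questions: percentage points describe the gap between the two rates, while relative change describes that gap as a fraction of the starting rate.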

| Sampling Method | Description | Example |
|-----------------|-------------|---------|
| Simple Random | Every member has an equal chance | Randomly select 50 students from a list |
| Convenience | Choose easiest to reach | Survey people in a cafeteria |
| Cluster | Randomly select groups, survey all in group | Randomly select 3 classes, survey all students in them |
| Systematic | Select every k-th individual | Survey every 10th person on a list |
| Stratified | Divide into groups, sample from each | Sample 10 students from each major |
