Introduction to Statistics: Key Concepts and Sampling Methods

Study Guide - Smart Notes

Chapter 1: Stats Starts Here

Introduction to Statistics

Statistics is the science of collecting, analyzing, interpreting, and presenting data. It provides methods for making informed decisions in the presence of uncertainty and variation.

Key Definitions

  • Data: Collections of observations such as measurements, grades, or survey responses. "Data" is the plural of "datum."

  • Statistic: A number calculated from a set of data, such as the average of a data set.

  • Population: The entire group of subjects that we are interested in studying.

  • Census: The collection of data from every member of the population.

  • Sample: A subcollection of members from a population.

Example: If we are interested in the study habits of Douglas College students and survey 200 students, the sample is the 200 surveyed students, and the population is all Douglas College students.

Subjects, Respondents, and Cases

  • Subjects: Individuals who participate in a study or experiment.

  • Respondents: Individuals who answer a survey.

  • Experimental units: Non-human subjects such as animals, plants, or objects.

  • Cases: The most generic term for the units or subjects in a dataset.

  • Variables: The characteristics recorded for each case (e.g., age, program of study, hours of sleep).

Parameter vs. Statistic

  • Parameter (Population Parameter): A numerical measurement describing a characteristic of a population.

  • Statistic (Sample Statistic): A numerical measurement describing a characteristic of a sample.

Note: Population parameters are typically unknown and must be estimated using sample statistics.
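The distinction between a parameter and a statistic can be illustrated with a minimal sketch. The population below is made up (10,000 hypothetical "weekly study hours" values), and the names `population`, `mu`, and `x_bar` are illustrative assumptions, not part of the original notes. In practice the parameter is usually unknown; it is computed here only because the whole population is simulated.

```python
import random
from statistics import mean

# Hypothetical population: weekly study hours for 10,000 students (made-up values).
population = [i % 41 for i in range(10_000)]

# Parameter: computed from every member of the population (a census).
mu = mean(population)

# Statistic: computed from a random sample of 200 members.
rng = random.Random(1)  # fixed seed so the run is repeatable
sample = rng.sample(population, 200)
x_bar = mean(sample)

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
```

Rerunning with a different seed changes the statistic but not the parameter, which is why a sample statistic is treated as an estimate.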

Statistical Questions and Significance

  • Statistical Question: A question that can be answered by collecting data and where variability is expected in the data.

  • Statistical Significance: A result is statistically significant if it would be unlikely to occur by chance alone, that is, if it is larger than what ordinary sample-to-sample variation would plausibly produce.

Example: Suppose a sample of 200 women and 150 men is surveyed, and 41 women and 30 men smoke. The sample proportions are 41/200 = 20.5% for women and 30/150 = 20.0% for men. We use these sample statistics to estimate the population proportions and to test whether the smoking rates differ.
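The arithmetic behind the smoking example can be sketched as follows. The notes do not say which test is intended; the pooled two-proportion z statistic below is one common choice and is an assumption here, as are all variable names.

```python
from math import sqrt

# Counts taken from the example above.
women_n, women_smokers = 200, 41
men_n, men_smokers = 150, 30

# Sample proportions (the sample statistics).
p_women = women_smokers / women_n
p_men = men_smokers / men_n

# Pooled two-proportion z statistic (assumed test; not specified in the notes).
p_pool = (women_smokers + men_smokers) / (women_n + men_n)
se = sqrt(p_pool * (1 - p_pool) * (1 / women_n + 1 / men_n))
z = (p_women - p_men) / se

print(f"women: {p_women:.3f}, men: {p_men:.3f}, z = {z:.2f}")
```

A z value this close to zero (well inside the conventional ±1.96 cutoff) means the observed difference is easily explained by sampling variation, so it would not be called statistically significant.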

Types of Variables and Data

Quantitative vs. Qualitative Variables

  • Quantitative (Numerical) Variables: Variables that consist of numbers representing measurements or counts.

  • Qualitative (Categorical) Variables: Variables that consist of names or labels that are not measurements or counts.

Types of Quantitative Data

  • Discrete Data: Data for which the number of possible values is finite or countable (e.g., number of students in a class).

  • Continuous Data: Data that can take any value within an interval, so the possible values are infinite and not countable (e.g., height, weight).

Types of Categorical Data

  • Nominal Data: Categorical data that does not have an order (e.g., gender, color).

  • Identifier Variables: Unique identifiers for individuals, such as student numbers. These are categorical but not used for analysis.

Note: Not all variables with numerical values are quantitative. For example, student numbers are categorical because they are labels, not measurements.

Sampling and Bias

Sampling Basics

  • Sample: A part of the whole population, selected for analysis.

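  • Randomization: Selecting subjects by chance. This protects against selection bias and tends to make the sample representative of the population, although it cannot guarantee it for any single sample.

  • Sample Size: The number of subjects in the sample. Larger samples generally yield more precise estimates, but a large sample does not correct for bias.

Bias: A systematic tendency for a sample to over- or underrepresent part of the population.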

Example: An online survey posted on a city website may introduce bias if only certain groups are likely to respond.

Sampling Frame

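  • The sampling frame is the list of individuals from which the sample is actually drawn. Only members of the frame have a chance of being selected, so a frame that omits part of the population leads to a biased sample.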

Simple Random Sample (SRS)

  • Every member of the population has an equal chance of being selected, and every possible sample of the same size is equally likely to be chosen.

  • Random selection can be done using random number generators or drawing names from a hat.
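A simple random sample drawn with a random number generator might look like the sketch below. The sampling frame of student ID numbers 1 to 5000 is hypothetical, and the seed is fixed only so the run is repeatable.

```python
import random

# Hypothetical sampling frame: student ID numbers 1..5000 (illustrative only).
frame = list(range(1, 5001))

rng = random.Random(42)          # fixed seed for repeatability
srs = rng.sample(frame, k=200)   # 200 distinct IDs, drawn without replacement

print(sorted(srs)[:10])          # peek at the ten smallest selected IDs
```

`random.sample` selects without replacement, so no student can appear twice, and every subset of 200 IDs is equally likely.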

Other Sampling Methods

  • Systematic Sampling: Select every k-th subject from a list after a random start.

  • Convenience Sampling: Sample subjects that are easiest to reach; often not representative.

  • Cluster Sampling: Divide the population into clusters, randomly select clusters, and sample all subjects within chosen clusters.

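  • Multistage Sampling: Combine several sampling methods in stages (e.g., randomly select clusters, then take a simple random sample within each selected cluster).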

  • Voluntary Response Sampling: Individuals choose to participate; often leads to bias.

  • Undercoverage: Not a sampling method but a source of bias. Some groups in the population are left out of the sampling process, often because the sampling frame is incomplete.
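Systematic and cluster sampling, as described in the list above, can be sketched as follows. The helper names `systematic_sample` and `cluster_sample`, the 100-person frame, and the six made-up classes are all assumptions for illustration.

```python
import random

def systematic_sample(frame, k, rng):
    """Take every k-th item after a random start among the first k positions."""
    start = rng.randrange(k)
    return frame[start::k]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(7)  # fixed seed for repeatability

# Hypothetical frame of 100 people, numbered 1..100.
frame = list(range(1, 101))
every_10th = systematic_sample(frame, 10, rng)

# Hypothetical clusters: six classes of five students each.
clusters = {
    f"class_{c}": [f"class_{c}_student_{i}" for i in range(1, 6)]
    for c in "ABCDEF"
}
two_classes = cluster_sample(clusters, 2, rng)

print(every_10th)
print(two_classes)
```

Note the difference in what is randomized: systematic sampling randomizes only the starting point, while cluster sampling randomizes which groups are included and then takes everyone in them.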

Types of Bias

  • Nonresponse Bias: Not everyone selected responds; those who do may differ from those who do not.

  • Response Bias: Survey design or respondent behavior influences answers (e.g., wording of questions, desire to please interviewer).

Survey Design and Question Wording

Good Survey Practices

  • Ask specific, quantitative questions rather than general ones.

  • Phrase questions neutrally to avoid influencing responses.

  • Ensure questions are clear and address the information needed.

  • Consider who is being asked and whether they are the right respondents.

Example: "How many hours did you sleep last night?" is better than "How much do you sleep?"

Example of Question Wording Effect: Two similar questions about government wiretaps received different approval rates due to differences in wording, demonstrating the impact of question phrasing on survey results.

Summary Table: Types of Variables

Type | Description | Examples
-----|-------------|---------
Quantitative (Numerical) | Numbers representing measurements or counts | Height, weight, number of books
Qualitative (Categorical) | Names or labels, not measurements | Gender, color, student number
Discrete | Finite/countable values | Number of students in a class
Continuous | Infinite/not countable values | Height, time, temperature
Nominal | Categorical data with no order | Color, gender

Key Formulas

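  • Sample Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $n$ is the sample size.

  • Population Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, where $N$ is the population size.

  • Sample Proportion: $\hat{p} = \frac{x}{n}$, where $x$ is the number of sampled subjects with the characteristic of interest.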
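As a minimal sketch, the sample mean (the sum of the observations divided by how many there are) and the sample proportion (the count with the characteristic divided by the sample size) can be computed as below. The function names and the `hours` values are made up for illustration; the counts 41 and 200 come from the smoking example earlier in these notes.

```python
def sample_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)

def sample_proportion(successes, n):
    """Number of subjects with the characteristic divided by the sample size."""
    return successes / n

hours = [6, 7, 8, 5, 9]                 # made-up hours of sleep for five respondents
print(sample_mean(hours))               # 35 / 5 = 7.0
print(sample_proportion(41, 200))       # 41 smokers among 200 women = 0.205
```

The population mean uses the same calculation as the sample mean, applied to every member of the population rather than to a sample.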

Additional info: This guide expands on the original notes by providing clear definitions, examples, and a summary table for variable types, as well as key formulas for basic statistics.
