Chi-Square Calculator
Run a chi-square goodness-of-fit test, test of independence on a contingency table, or find a critical value — with a shaded distribution curve, color-coded contribution heatmap, and full step-by-step solutions.
Background
The chi-square (χ²) test is one of the most widely used statistical tests for categorical data. It answers two main questions: Does my sample fit the expected distribution? (goodness-of-fit) and Are two categorical variables independent of each other? (test of independence). In both cases, the test compares what you observed in your data against what you would expect if the null hypothesis were true, using the formula χ² = Σ (O − E)² / E. The larger the chi-square statistic, the more your data deviates from the null expectation.
How to use this calculator
- Goodness of Fit: enter your observed frequencies and expected frequencies (or equal proportions), set your significance level, and click Calculate to see χ², df, p-value, and whether to reject H₀.
- Test of Independence: enter your contingency table of observed counts row by row. The calculator computes expected values, χ², df, and p-value automatically.
- Critical Value / p-value: enter df and α to get the critical value, or enter a χ² statistic and df to get the p-value directly.
- Use the quick example chips to instantly load classic textbook scenarios.
How the chi-square test works
The core formula. For every category or cell, compute (O − E)² / E where O is the observed count and E is the expected count. Sum these up across all categories to get χ².
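A minimal Python sketch of the sum described above. The function name `chi_square_statistic` is illustrative and not part of the calculator; the data are the die counts used in Example 1.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("every expected count must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Die counts from Example 1: 60 rolls, 10 expected per face.
observed = [8, 12, 9, 14, 11, 6]
expected = [10] * 6
statistic = chi_square_statistic(observed, expected)
print(round(statistic, 3))  # 4.2
```

The same function works for a contingency table once the observed and expected cells are flattened into two lists of equal length.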
Degrees of freedom. For a goodness-of-fit test with k categories: df = k − 1. For a test of independence with an r × c table: df = (r − 1)(c − 1).
Expected values for independence tests. For each cell in row i, column j: E = (row total × column total) / grand total.
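The expected-count rule and the df formula for a table can be sketched as follows. The function names and the 2 × 3 table are hypothetical, chosen only to illustrate the arithmetic.

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def independence_df(table):
    """Degrees of freedom for an r x c table: (r - 1)(c - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical 2 x 3 table of observed counts (illustrative, not from the text).
table = [[20, 30, 50],
         [30, 30, 40]]
print(expected_counts(table))  # [[25.0, 30.0, 45.0], [25.0, 30.0, 45.0]]
print(independence_df(table))  # 2
```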
The p-value. Using the chi-square distribution with the computed df, the p-value is the probability of getting a χ² statistic this large or larger if H₀ is true. If p ≤ α, reject H₀.
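One way to compute this upper-tail probability without a statistics library is the closed form that exists for whole-number df. The sketch below uses only Python's standard `math` module; the name `chi2_sf` is an assumption for illustration, and how the calculator itself computes p-values is not stated here.

```python
import math


def chi2_sf(x, df):
    """P(chi-square >= x) for a positive integer df, via a finite closed form."""
    if not isinstance(df, int) or df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term = math.exp(-half)
        total = term
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return total
    # Odd df: start from erfc(sqrt(x/2)), the df = 1 case, then add
    # exp(-x/2) * (x/2)^(j - 1/2) / Gamma(j + 1/2) for j = 1 .. (df - 1)/2.
    total = math.erfc(math.sqrt(half))
    term = math.exp(-half) * math.sqrt(half) / math.gamma(1.5)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (j + 0.5)
    return total


alpha = 0.05
p_value = chi2_sf(4.2, 5)      # statistic and df from Example 1
print(round(p_value, 3))       # 0.521
print(p_value <= alpha)        # False, so H0 is not rejected
```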
Assumption check. Expected cell counts should generally be at least 5 in each cell. If any expected count falls below 5, the chi-square approximation may be unreliable — this calculator flags that automatically.
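A check of this kind might look like the sketch below. The helper name `low_expected_cells`, the default threshold argument, and the sample values are assumptions for illustration; the calculator's own flagging logic is not documented here.

```python
def low_expected_cells(expected, threshold=5):
    """Return (row, column, value) for every expected count below the threshold."""
    return [(i, j, e)
            for i, row in enumerate(expected)
            for j, e in enumerate(row)
            if e < threshold]


# Hypothetical expected counts for a 2 x 2 table (illustrative only).
expected = [[12.0, 18.0],
            [3.2, 4.8]]
flagged = low_expected_cells(expected)
print(flagged)  # [(1, 0, 3.2), (1, 1, 4.8)]
```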
Formula & Equations Used
Chi-square statistic: χ² = Σ (O − E)² / E
Goodness-of-fit degrees of freedom: df = k − 1 (k = number of categories)
Independence test degrees of freedom: df = (r − 1)(c − 1)
Expected frequency (independence): E_ij = (Row_i total × Col_j total) / Grand total
p-value: P(χ² ≥ χ²_observed | df) from the chi-square distribution
Example Problems & Step-by-Step Solutions
Example 1 — Goodness of fit: fair die
A die is rolled 60 times. Observed: [8, 12, 9, 14, 11, 6]. Is it fair?
Step 1: Expected = 60/6 = 10 per face. df = 6 − 1 = 5.
Step 2: χ² = (8−10)²/10 + (12−10)²/10 + ... = 0.4+0.4+0.1+1.6+0.1+1.6 = 4.2
Step 3: p-value ≈ 0.521. Since p > 0.05, fail to reject H₀: the data do not provide sufficient evidence that the die is unfair.
Example 2 — Test of independence: gender vs. preference
Survey of 100 people: do men and women prefer different beverages?
Step 1: Build the contingency table and compute row/column totals.
Step 2: For each cell, E = (row total × col total) / 100.
Step 3: Compute (O−E)²/E for every cell, sum to get χ². df = (rows−1)(cols−1).
Step 4: Compare p-value to α = 0.05 to decide.
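The four steps can be traced in code. The example above gives no counts, so the 2 × 2 table below is hypothetical (rows: men, women; columns: two beverages) and only illustrates the procedure. For df = 1 the upper-tail probability has the closed form erfc(√(χ²/2)), which is what the sketch uses.

```python
import math

# Hypothetical counts for 100 respondents (not taken from the example text).
observed = [[30, 20],
            [15, 35]]

# Step 1: row and column totals.
row_totals = [sum(row) for row in observed]        # [50, 50]
col_totals = [sum(col) for col in zip(*observed)]  # [45, 55]
n = sum(row_totals)                                # 100

# Steps 2 and 3: expected count per cell, then sum (O - E)^2 / E.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (2 - 1)(2 - 1) = 1

# Step 4: p-value for df = 1, compared with alpha = 0.05.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 3), df)  # 9.091 1
print(p_value <= 0.05)     # True, so H0 is rejected for these made-up counts
```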
Example 3 — Low expected count warning
If any cell has an expected count below 5, the chi-square approximation becomes unreliable.
Why it matters: The χ² distribution is a continuous approximation to a discrete count distribution. When expected counts are small, this approximation breaks down and p-values become inaccurate.
Fix: Combine low-frequency categories, collect more data, or use Fisher's Exact Test instead.
Example 4 — Reading a chi-square table
With df = 4 and α = 0.05, the critical value is χ²_crit = 9.488.
Decision rule: If your computed χ² ≥ 9.488, reject H₀. If χ² < 9.488, fail to reject H₀.
Equivalently: if p-value ≤ 0.05, reject H₀. The two approaches give the same decision, because χ² ≥ χ²_crit exactly when p ≤ α.
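A table value such as 9.488 can be reproduced numerically by searching for the χ² at which the upper-tail probability equals α. The sketch below is restricted to even df, where the tail probability is a short finite sum; the function names are illustrative, and bisection is one simple choice of root finder rather than the method any particular table or calculator uses.

```python
import math


def chi2_sf_even(x, df):
    """P(chi-square >= x) for even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!."""
    if df < 2 or df % 2 != 0:
        raise ValueError("this sketch handles even df only")
    if x <= 0:
        return 1.0
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j)
                                 for j in range(df // 2))


def critical_value_even(df, alpha, tol=1e-9):
    """Find x with chi2_sf_even(x, df) = alpha by bisection."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, df) > alpha:  # widen until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, df) > alpha:
            lo = mid  # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(critical_value_even(4, 0.05), 3))  # 9.488
```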
Frequently Asked Questions
What is the null hypothesis in a chi-square test?
For a goodness-of-fit test, H₀ states that the population follows the specified distribution. For a test of independence, H₀ states that the two categorical variables are independent of each other — knowing one variable's value tells you nothing about the other.
Why does chi-square only test for significant difference, not direction?
Because chi-square squares the differences (O − E)², making all deviations positive regardless of direction. The test tells you whether your observed data is surprisingly different from expected, but not which categories are higher or lower than expected — you need to inspect the residuals for that.
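One common choice of residual is the Pearson residual, (O − E) / √E, which keeps the sign that squaring discards; its squares add up to χ². The sketch below applies it to the die counts from Example 1. The function name is illustrative, and whether the calculator reports this particular residual is an assumption.

```python
import math


def pearson_residuals(observed, expected):
    """Signed residual (O - E) / sqrt(E) for each category."""
    return [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]


observed = [8, 12, 9, 14, 11, 6]  # die counts from Example 1
expected = [10] * 6
residuals = pearson_residuals(observed, expected)
print([round(r, 2) for r in residuals])
# [-0.63, 0.63, -0.32, 1.26, 0.32, -1.26]
```

Positive values mark categories observed more often than expected, negative values those observed less often.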
What does "degrees of freedom" mean in this context?
Degrees of freedom are the number of values in the calculation that are free to vary. For a goodness-of-fit test with k categories, once you know k−1 of the deviations, the last one is determined (they must sum to zero), so df = k−1. For an r×c table, the row and column totals are fixed, so once (r−1)(c−1) cells are known the remaining cells are determined, giving df = (r−1)(c−1).
What's the difference between goodness-of-fit and test of independence?
Goodness-of-fit tests one variable against a known or theoretical distribution (e.g., is this die fair?). Test of independence tests whether two categorical variables measured on the same subjects are related (e.g., is smoking related to gender?). Both use the same χ² formula but differ in how expected values are computed and how df is calculated.
Can chi-square be used for continuous data?
No — chi-square requires categorical (count) data. If you have continuous measurements, you would need to bin them into categories first, though this loses information. For comparing means of continuous variables, t-tests or ANOVA are more appropriate.
Why must expected counts be at least 5?
The chi-square distribution is a continuous approximation to the discrete distribution of the test statistic. When expected cell counts are very small, the discrete distribution departs too far from this approximation and the p-value becomes inaccurate. The rule of thumb of E ≥ 5 per cell keeps the approximation reliable enough for most practical use.