Chi-Square Test Explained


Chi-square tests analyze count data for categorical variables. The two common forms are the test of independence (are two categorical variables associated?) and the goodness-of-fit test (do observed frequencies match an expected distribution?).

Chi-Square Test of Independence

Question: Are Two Variables Related?

from scipy.stats import chi2_contingency
import pandas as pd

# Example: Is gender related to product preference?
data = pd.DataFrame({
    'Gender': ['M', 'M', 'M', 'F', 'F', 'F'] * 50,
    'Product': ['A', 'B', 'C', 'A', 'B', 'C'] * 50
})

# Create contingency table
contingency_table = pd.crosstab(data['Gender'], data['Product'])
print(contingency_table)

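# Product   A   B   C
# Gender
# F        50  50  50
# M        50  50  50
#
# Note: this toy data is perfectly balanced, so every observed count equals
# its expected count and the test below returns chi2 = 0 and p = 1.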

Run the Test

# Chi-square test
chi2, p_value, dof, expected = chi2_contingency(contingency_table)

print(f"Chi-square statistic: {chi2:.3f}")
print(f"P-value: {p_value:.3f}")
print(f"Degrees of freedom: {dof}")
print("\nExpected frequencies:")
print(expected)

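# Interpretation
# A large p-value means we fail to reject independence;
# it does not prove the variables are independent.
alpha = 0.05
if p_value < alpha:
    print("✓ Evidence of an association (reject independence)")
else:
    print("✗ No significant evidence of an association")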

Real Example: Marketing Channel Performance

import numpy as np

# Observed data
observed = pd.DataFrame({
    'Email': [45, 55],
    'Social': [30, 70],
    'Search': [60, 40]
}, index=['Converted', 'Not Converted'])

print("Observed frequencies:")
print(observed)

#                Email  Social  Search
# Converted         45      30      60
# Not Converted     55      70      40

# Test if conversion rate differs by channel
chi2, p, dof, expected = chi2_contingency(observed)

print(f"\nP-value: {p:.4f}")
if p < 0.05:
    print("Channels have significantly different conversion rates!")
else:
    print("No significant difference between channels")
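
To see what chi2_contingency computes, the statistic can be reproduced by hand: expected counts come from the row and column totals under independence, and the statistic sums (observed - expected)^2 / expected over all cells. The following is a minimal sketch using the channel table above. The variable names are illustrative, and it assumes no continuity correction (SciPy applies Yates' correction only when the table has one degree of freedom, which is not the case for this 2x3 table).

```python
import numpy as np
from scipy import stats

# Same channel table as above (rows: Converted, Not Converted)
obs = np.array([[45, 30, 60],
                [55, 70, 40]])

row_totals = obs.sum(axis=1, keepdims=True)   # shape (2, 1)
col_totals = obs.sum(axis=0, keepdims=True)   # shape (1, 3)
n = obs.sum()

# Expected counts if conversion were independent of channel
expected_manual = row_totals * col_totals / n

# Pearson chi-square statistic and its degrees of freedom
stat_manual = ((obs - expected_manual) ** 2 / expected_manual).sum()
dof_manual = (obs.shape[0] - 1) * (obs.shape[1] - 1)

# Upper-tail probability of the chi-square distribution
p_manual = stats.chi2.sf(stat_manual, dof_manual)

# Cross-check against SciPy's implementation
stat_scipy, p_scipy, dof_scipy, _ = stats.chi2_contingency(obs)
print(f"Manual: chi2={stat_manual:.3f}, dof={dof_manual}, p={p_manual:.4f}")
print(f"SciPy:  chi2={stat_scipy:.3f}, dof={dof_scipy}, p={p_scipy:.4f}")
```

Both lines should report the same statistic (about 18.18 with 2 degrees of freedom), which confirms that the library call is doing nothing more exotic than this arithmetic.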

Chi-Square Goodness of Fit

Question: Does Data Match Expected Distribution?

from scipy.stats import chisquare

# Example: Are die rolls fair?
observed_rolls = [12, 15, 18, 14, 16, 15]  # Counts for faces 1-6
expected_rolls = [15, 15, 15, 15, 15, 15]  # Expected for fair die

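# Observed and expected counts must sum to the same total (90 here);
# recent SciPy versions raise an error otherwise
chi2, p_value = chisquare(observed_rolls, f_exp=expected_rolls)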

print(f"Chi-square: {chi2:.3f}")
print(f"P-value: {p_value:.3f}")

if p_value < 0.05:
    print("Evidence that the die is NOT fair")
else:
    print("No evidence that the die is unfair")
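
The goodness-of-fit statistic uses the same formula as the independence test, with degrees of freedom equal to the number of categories minus one. As a minimal sketch (names such as gof_stat are illustrative, not from the original), the die example can be checked by hand:

```python
import numpy as np
from scipy import stats

observed_rolls = np.array([12, 15, 18, 14, 16, 15])      # counts for faces 1-6
expected_rolls = np.full(6, observed_rolls.sum() / 6)    # 15 per face for a fair die

# Sum of (observed - expected)^2 / expected over the six faces
gof_stat = ((observed_rolls - expected_rolls) ** 2 / expected_rolls).sum()
gof_dof = len(observed_rolls) - 1
gof_p = stats.chi2.sf(gof_stat, gof_dof)

# Cross-check against scipy.stats.chisquare
result = stats.chisquare(observed_rolls, f_exp=expected_rolls)
print(f"Manual: chi2={gof_stat:.3f}, p={gof_p:.3f}")
print(f"SciPy:  chi2={result.statistic:.3f}, p={result.pvalue:.3f}")
```

The squared deviations are 9, 0, 9, 1, 1, 0, so the statistic is 20 / 15, or about 1.333, which is far too small to suggest an unfair die.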

Practical Example: Survey Analysis

# Survey: Is satisfaction related to age group?
survey_data = {
    '18-25': [20, 30, 10],  # [Satisfied, Neutral, Dissatisfied]
    '26-40': [45, 35, 20],
    '41-60': [55, 25, 20],
    '60+':   [30, 15, 5]
}

# Create DataFrame
df = pd.DataFrame(survey_data,
                 index=['Satisfied', 'Neutral', 'Dissatisfied'])

print(df)

# Run test
chi2, p, dof, expected = chi2_contingency(df)

print(f"\nChi-square: {chi2:.2f}")
print(f"P-value: {p:.4f}")

if p < 0.05:
    print("✓ Satisfaction significantly differs across age groups")
else:
    print("✗ No significant difference")
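
A significant result says that satisfaction and age group are associated, but not which cells drive the association. One common follow-up, sketched here as an addition to the original example, is to inspect Pearson residuals, (observed - expected) / sqrt(expected); their squares sum to the chi-square statistic. The name residuals and the rough |residual| > 2 screening threshold are assumptions of this sketch, not part of the test itself.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Same survey table as above
survey = pd.DataFrame({
    '18-25': [20, 30, 10],
    '26-40': [45, 35, 20],
    '41-60': [55, 25, 20],
    '60+':   [30, 15, 5],
}, index=['Satisfied', 'Neutral', 'Dissatisfied'])

chi2_stat, p_val, dof_val, expected_counts = chi2_contingency(survey)

# Pearson residuals: positive means more observations than expected
residuals = (survey - expected_counts) / np.sqrt(expected_counts)
print(residuals.round(2))

# Rough screen (assumed threshold): cells whose residual exceeds 2 in magnitude
print(residuals.abs() > 2)
```

Cells with large positive or negative residuals are the ones to describe when reporting where the groups differ.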

Calculating Effect Size (Cramér's V)

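import numpy as np

def cramers_v(contingency_table):
    # Accept DataFrames, arrays, or nested lists
    table = np.asarray(contingency_table)
    # correction=False: SciPy applies Yates' correction to 2x2 tables by
    # default, which would shrink the statistic and therefore V
    chi2, p, dof, expected = chi2_contingency(table, correction=False)
    n = table.sum()
    min_dim = min(table.shape) - 1
    return np.sqrt(chi2 / (n * min_dim))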

# Calculate effect size
v = cramers_v(observed)
print(f"Cramér's V: {v:.3f}")

# Rough rule-of-thumb interpretation (conventions vary by field and table size):
# 0.0 - 0.1: Negligible
# 0.1 - 0.3: Weak
# 0.3 - 0.5: Moderate
# 0.5+: Strong

Assumptions and Requirements

Independence: Observations must be independent
Sample size: Expected (not observed) frequency ≥ 5 in each cell, as a rule of thumb
Categorical data: Use counts, not percentages
Mutually exclusive: Each observation in exactly one category

When Expected Counts are Low

# Fisher's Exact Test for small samples
from scipy.stats import fisher_exact

# 2x2 table with small counts
table = [[3, 7],
         [8, 2]]

odds_ratio, p_value = fisher_exact(table)
print(f"P-value: {p_value:.3f}")

# More reliable than chi-square for small samples
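
Whether a table falls into the small-sample case can be checked from its expected counts before choosing a test. The following is a minimal sketch: low_expected_cells is a hypothetical helper name, and the threshold of 5 is the rule of thumb from the assumptions list, not a hard cutoff.

```python
import numpy as np
from scipy.stats import chi2_contingency

def low_expected_cells(table, threshold=5):
    """Return the expected counts and how many of them fall below threshold."""
    _, _, _, expected = chi2_contingency(np.asarray(table))
    return expected, int((expected < threshold).sum())

# Same 2x2 table as in the Fisher's Exact Test example
small_table = [[3, 7],
               [8, 2]]

expected_small, n_low = low_expected_cells(small_table)
print(expected_small)
print(f"Cells with expected count < 5: {n_low}")
# Expected counts are [[5.5, 4.5], [5.5, 4.5]], so 2 cells fall below 5
```

Here two of the four expected counts are below 5, which is why Fisher's Exact Test is the safer choice for this table.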

Visualizing Results

import seaborn as sns
import matplotlib.pyplot as plt

# Heatmap of contingency table
sns.heatmap(contingency_table, annot=True, fmt='d', cmap='Blues')
plt.title('Gender vs Product Preference')
plt.show()

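# Bar plot comparing observed vs expected (marketing channel table).
# Recompute expected here: the variable was overwritten by the survey test.
_, _, _, expected = chi2_contingency(observed)
expected_df = pd.DataFrame(expected, index=observed.index, columns=observed.columns)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
observed.plot(kind='bar', ax=ax1, title='Observed')
expected_df.plot(kind='bar', ax=ax2, title='Expected')
plt.tight_layout()
plt.show()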

Common Use Cases

A/B testing with categorical outcomes
Survey analysis (demographics vs responses)
Quality control (defect rates by production line)
Medical studies (treatment vs outcome)
Market research (customer segment vs preferences)

Pro Tip: Always check that expected frequencies are ≥5 in each cell. For 2x2 tables with small samples, use Fisher's Exact Test instead. Report the effect size (Cramér's V) alongside the p-value, since a small p-value alone says nothing about how strong the association is.
