Chi-Square Test Explained
Chi-square tests analyze relationships between categorical variables. Use them to test whether two variables are independent, or whether observed frequencies match an expected distribution (goodness of fit).
Chi-Square Test of Independence
Question: Are Two Variables Related?
from scipy.stats import chi2_contingency
import pandas as pd
# Example: Is gender related to product preference?
data = pd.DataFrame({
'Gender': ['M', 'M', 'M', 'F', 'F', 'F'] * 50,
'Product': ['A', 'B', 'C', 'A', 'B', 'C'] * 50
})
# Create contingency table
contingency_table = pd.crosstab(data['Gender'], data['Product'])
print(contingency_table)
# Product   A   B   C
# Gender
# F        50  50  50
# M        50  50  50
Run the Test
# Chi-square test
chi2, p_value, dof, expected = chi2_contingency(contingency_table)
print(f"Chi-square statistic: {chi2:.3f}")
print(f"P-value: {p_value:.3f}")
print(f"Degrees of freedom: {dof}")
print("\nExpected frequencies:")
print(expected)
# Interpretation
alpha = 0.05
if p_value < alpha:
    print("✓ Reject independence: the variables appear to be related")
else:
    print("✗ No significant evidence that the variables are related")
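To make the output of the test less of a black box, the statistic can be rebuilt by hand. The sketch below uses a made-up 2x3 table (the counts are illustrative, not from the example above) and assumes the usual Pearson formula: expected count = row total × column total / n, and the statistic is the sum of (observed − expected)² / expected.

```python
import numpy as np
from scipy import stats

# Hypothetical 2x3 table of counts (rows: groups, columns: categories)
observed = np.array([[30, 45, 25],
                     [50, 35, 15]])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 3)
n = observed.sum()

# Expected counts if rows and columns were independent
expected_manual = row_totals * col_totals / n

# Pearson chi-square statistic
stat_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()

# Degrees of freedom: (rows - 1) * (columns - 1)
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# P-value: upper tail of the chi-square distribution
p_manual = stats.chi2.sf(stat_manual, dof_manual)

# Compare with SciPy's result
stat, p, dof, expected = stats.chi2_contingency(observed)
print(f"manual: stat={stat_manual:.4f}, p={p_manual:.4f}, dof={dof_manual}")
print(f"scipy:  stat={stat:.4f}, p={p:.4f}, dof={dof}")
```

The two lines should agree here because the table is larger than 2x2; for 2x2 tables `chi2_contingency` applies Yates' continuity correction by default, so the hand calculation would differ unless `correction=False` is passed.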
Real Example: Marketing Channel Performance
import numpy as np
# Observed data
observed = pd.DataFrame({
'Email': [45, 55],
'Social': [30, 70],
'Search': [60, 40]
}, index=['Converted', 'Not Converted'])
print("Observed frequencies:")
print(observed)
#                Email  Social  Search
# Converted         45      30      60
# Not Converted     55      70      40
# Test if conversion rate differs by channel
chi2, p, dof, expected = chi2_contingency(observed)
print(f"\nP-value: {p:.4f}")
if p < 0.05:
    print("Channels have significantly different conversion rates!")
else:
    print("No significant difference between channels")
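A significant result says the channels differ, but not which ones. One common follow-up, sketched here on the same marketing table, is to look at per-channel conversion rates and Pearson residuals, (observed − expected) / √expected. Treating residuals with absolute value around 2 or more as noteworthy is a rough convention, not a formal test.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

observed = pd.DataFrame({
    'Email': [45, 55],
    'Social': [30, 70],
    'Search': [60, 40]
}, index=['Converted', 'Not Converted'])

# Conversion rate per channel: converted / total for that channel
conversion_rate = observed.loc['Converted'] / observed.sum(axis=0)

# Expected counts under independence, with the original labels restored
_, _, _, expected = chi2_contingency(observed)
expected = pd.DataFrame(expected, index=observed.index, columns=observed.columns)

# Pearson residuals: positive means more observations than expected
residuals = (observed - expected) / np.sqrt(expected)

print(conversion_rate)
print(residuals.round(2))
```

In this table Email sits exactly at its expected counts, while Social and Search deviate in opposite directions.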
Chi-Square Goodness of Fit
Question: Does Data Match Expected Distribution?
from scipy.stats import chisquare
# Example: Are die rolls fair?
observed_rolls = [12, 15, 18, 14, 16, 15] # Counts for faces 1-6
expected_rolls = [15, 15, 15, 15, 15, 15] # Expected for fair die
chi2, p_value = chisquare(observed_rolls, expected_rolls)
print(f"Chi-square: {chi2:.3f}")
print(f"P-value: {p_value:.3f}")
if p_value < 0.05:
    print("Rolls deviate significantly from a fair die")
else:
    print("No significant evidence that the die is unfair")
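Expected counts are often specified as proportions rather than counts. The sketch below, using the same die rolls, converts assumed probabilities into expected counts (they must add up to the same total as the observed counts) and reproduces the statistic by hand with k − 1 degrees of freedom for k categories.

```python
import numpy as np
from scipy import stats

observed_rolls = np.array([12, 15, 18, 14, 16, 15])

# Hypothesized probabilities for a fair die, scaled to the observed total
fair_probs = np.full(6, 1 / 6)
expected_rolls = fair_probs * observed_rolls.sum()

# Manual statistic: sum of (observed - expected)^2 / expected
stat_manual = ((observed_rolls - expected_rolls) ** 2 / expected_rolls).sum()
dof = len(observed_rolls) - 1
p_manual = stats.chi2.sf(stat_manual, dof)

# Same test via SciPy
stat, p_value = stats.chisquare(observed_rolls, f_exp=expected_rolls)

print(f"manual: stat={stat_manual:.3f}, p={p_manual:.3f}")
print(f"scipy:  stat={stat:.3f}, p={p_value:.3f}")
```

To test against a different hypothesis (say, a die suspected of favoring one face), only `fair_probs` needs to change.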
Practical Example: Survey Analysis
# Survey: Is satisfaction related to age group?
survey_data = {
'18-25': [20, 30, 10], # [Satisfied, Neutral, Dissatisfied]
'26-40': [45, 35, 20],
'41-60': [55, 25, 20],
'60+': [30, 15, 5]
}
# Create DataFrame
df = pd.DataFrame(survey_data,
index=['Satisfied', 'Neutral', 'Dissatisfied'])
print(df)
# Run test
chi2, p, dof, expected = chi2_contingency(df)
print(f"\nChi-square: {chi2:.2f}")
print(f"P-value: {p:.4f}")
if p < 0.05:
    print("✓ Satisfaction significantly differs across age groups")
else:
    print("✗ No significant difference")
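Because the age groups have different sample sizes, raw counts are hard to compare by eye. A small sketch that converts the survey table to within-group proportions (for reading only; the test itself should still run on counts):

```python
import pandas as pd

survey_data = {
    '18-25': [20, 30, 10],  # [Satisfied, Neutral, Dissatisfied]
    '26-40': [45, 35, 20],
    '41-60': [55, 25, 20],
    '60+': [30, 15, 5]
}
df = pd.DataFrame(survey_data,
                  index=['Satisfied', 'Neutral', 'Dissatisfied'])

# Divide each column by its total so every age group sums to 1
col_share = df / df.sum(axis=0)

print((col_share * 100).round(1))
```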
Calculating Effect Size (Cramér's V)
import numpy as np
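def cramers_v(contingency_table):
    # Use the uncorrected statistic: Yates' correction (default for 2x2) would shrink V
    chi2, p, dof, expected = chi2_contingency(contingency_table, correction=False)
    n = contingency_table.sum().sum()  # total number of observations
    min_dim = min(contingency_table.shape) - 1
    return np.sqrt(chi2 / (n * min_dim))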
# Calculate effect size
v = cramers_v(observed)
print(f"Cramér's V: {v:.3f}")
# Interpretation (rough rule of thumb; conventions vary by field):
# 0.0 - 0.1: Negligible
# 0.1 - 0.3: Weak
# 0.3 - 0.5: Moderate
# 0.5+: Strong
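As a worked example, the sketch below computes Cramér's V for the marketing channel table and maps it onto the bands listed above. The helper `label_effect` is a hypothetical convenience function, and the cutoffs are just the ones from this article.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def cramers_v(table):
    """Cramér's V from the uncorrected chi-square statistic."""
    table = np.asarray(table)
    chi2_stat = chi2_contingency(table, correction=False)[0]
    n = table.sum()
    min_dim = min(table.shape) - 1
    return np.sqrt(chi2_stat / (n * min_dim))

def label_effect(v):
    """Hypothetical helper: map V onto the rule-of-thumb bands."""
    if v < 0.1:
        return "Negligible"
    if v < 0.3:
        return "Weak"
    if v < 0.5:
        return "Moderate"
    return "Strong"

observed = pd.DataFrame({
    'Email': [45, 55],
    'Social': [30, 70],
    'Search': [60, 40]
}, index=['Converted', 'Not Converted'])

v = cramers_v(observed)
print(f"Cramér's V: {v:.3f} ({label_effect(v)})")
```

Here the difference between channels is statistically significant, yet the association falls in the weak band, which is why it helps to report both numbers.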
Assumptions and Requirements
- Independence: Observations must be independent
- Sample size: Expected (not observed) frequency ≥ 5 in each cell, as a rule of thumb
- Categorical data: Use counts, not percentages
- Mutually exclusive: Each observation in exactly one category
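The sample-size requirement can be checked before running the test. The sketch below uses `scipy.stats.contingency.expected_freq` to compute expected counts; `check_expected_counts` is a hypothetical helper name, and the threshold of 5 is the rule of thumb from the list above.

```python
import numpy as np
from scipy.stats.contingency import expected_freq

def check_expected_counts(table, threshold=5):
    """Return (smallest expected count, number of cells below threshold)."""
    expected = expected_freq(np.asarray(table))
    return float(expected.min()), int((expected < threshold).sum())

# Survey table from the example above (rows: satisfaction, columns: age group)
survey = np.array([[20, 45, 55, 30],
                   [30, 35, 25, 15],
                   [10, 20, 20, 5]])

# A small 2x2 table where the rule of thumb fails
small = np.array([[3, 7],
                  [8, 2]])

for name, table in [("survey", survey), ("small 2x2", small)]:
    min_exp, n_low = check_expected_counts(table)
    print(f"{name}: min expected = {min_exp:.2f}, cells below 5 = {n_low}")
```

Note that the survey table contains an observed count of 5, but what matters is the expected count, which is comfortably above the threshold.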
When Expected Counts are Low
# Fisher's Exact Test for small samples
from scipy.stats import fisher_exact
# 2x2 table with small counts
table = [[3, 7],
[8, 2]]
odds_ratio, p_value = fisher_exact(table)
print(f"P-value: {p_value:.3f}")
# More reliable than chi-square for small samples
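For comparison, the sketch below runs both tests on the same small table. It also shows the `alternative` parameter of `fisher_exact`, which allows a one-sided test; the choice of `'less'` here is only an illustration and would need to be justified by the hypothesis in a real analysis.

```python
from scipy.stats import chi2_contingency, fisher_exact

table = [[3, 7],
         [8, 2]]

# Two-sided exact test (the default)
odds_ratio, p_two_sided = fisher_exact(table, alternative='two-sided')

# One-sided exact test: is the odds ratio less than 1?
_, p_less = fisher_exact(table, alternative='less')

# Chi-square on the same table (Yates' correction applies to 2x2 by default)
_, p_chi2, _, expected = chi2_contingency(table)

print(f"Sample odds ratio: {odds_ratio:.3f}")
print(f"Fisher two-sided p: {p_two_sided:.3f}")
print(f"Fisher one-sided p: {p_less:.3f}")
print(f"Chi-square p:       {p_chi2:.3f}")
print(f"Smallest expected count: {expected.min():.1f}")
```

The smallest expected count is below 5, so the chi-square approximation is questionable for this table and the exact p-value is the safer one to report.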
Visualizing Results
import seaborn as sns
import matplotlib.pyplot as plt
# Heatmap of contingency table
sns.heatmap(contingency_table, annot=True, fmt='d', cmap='Blues')
plt.title('Gender vs Product Preference')
plt.show()
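# Bar plots comparing observed vs expected counts for the marketing data
_, _, _, expected = chi2_contingency(observed)  # recompute so expected matches observed
expected_df = pd.DataFrame(expected, index=observed.index, columns=observed.columns)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
observed.plot(kind='bar', ax=ax1, title='Observed')
expected_df.plot(kind='bar', ax=ax2, title='Expected')
plt.tight_layout()
plt.show()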
Common Use Cases
- A/B testing with categorical outcomes
- Survey analysis (demographics vs responses)
- Quality control (defect rates by production line)
- Medical studies (treatment vs outcome)
- Market research (customer segment vs preferences)
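For the A/B testing case, the table is usually 2x2, and `chi2_contingency` then applies Yates' continuity correction by default. The sketch below uses made-up counts (purely illustrative) to show how much the `correction` argument can move the p-value near the 0.05 boundary.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical A/B test counts: rows are variants, columns are outcomes
#                     Converted  Not converted
ab_table = np.array([[120, 880],    # Variant A
                     [150, 850]])   # Variant B

# Default: Yates' continuity correction is applied because dof == 1
stat_corr, p_corr, dof, _ = chi2_contingency(ab_table)

# Uncorrected Pearson chi-square
stat_raw, p_raw, _, _ = chi2_contingency(ab_table, correction=False)

print(f"Degrees of freedom: {dof}")
print(f"With correction:    chi2={stat_corr:.3f}, p={p_corr:.4f}")
print(f"Without correction: chi2={stat_raw:.3f}, p={p_raw:.4f}")
```

The corrected test is more conservative. Whichever variant is used, it is worth stating it when reporting results so the numbers can be reproduced.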
Pro Tip: Always check that expected frequencies are ≥5 in each cell. For 2x2 tables with small samples, use Fisher's Exact Test instead. Report the effect size (Cramér's V) alongside the p-value.