Useful Data Tips

5 Descriptive Statistics You Should Always Check First


Before diving into complex analysis, always start with these 5 statistics. They reveal data quality issues, outliers, and distribution shape.

1. Mean vs Median - Detect Skewness

Mean: Average of all values (sensitive to outliers)

Median: Middle value (robust to outliers)

# Python
df['salary'].mean()   # 75,000
df['salary'].median() # 58,000

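What this tells you: the mean (75,000) sits well above the median (58,000), so a few high salaries are pulling the average up. The distribution is right-skewed, and the median is the better summary of a typical salary.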
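
To make the mean-versus-median comparison concrete, here is a minimal sketch. The salary values are made up for illustration, and the `gap_ratio` name is just a label chosen for this example.

```python
import pandas as pd

# Hypothetical salaries: most are modest, one is very large.
salaries = pd.Series([40_000, 45_000, 50_000, 55_000, 60_000, 400_000])

mean = salaries.mean()
median = salaries.median()

# A mean well above the median suggests right skew;
# a mean well below the median suggests left skew.
gap_ratio = (mean - median) / median

print(f"mean={mean:.0f} median={median:.0f} gap={gap_ratio:.1%}")
print(f"skewness={salaries.skew():.2f}")  # pandas' sample skewness
```

A single large value is enough to drag the mean far from the median, which is why comparing the two is a quick first test for skew.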

2. Standard Deviation - Measure Spread

How much do values vary from the mean?

# Python
df['age'].std()  # 12.5

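# SQL (name and definition vary by database: SQL Server uses STDEV,
# and MySQL's STDDEV is the population version, unlike pandas' sample std)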
SELECT STDDEV(age) FROM users;

Rule of thumb: a high std dev means lots of variability; a low std dev means values are clustered near the mean.
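
One way to judge whether a standard deviation is "high" is to check how many values fall within one and two standard deviations of the mean. For roughly normal data the commonly cited figures are about 68% and 95%. The sketch below uses made-up ages, and the variable names are illustrative only.

```python
import pandas as pd

ages = pd.Series([22, 25, 29, 31, 34, 35, 38, 41, 47, 58])  # hypothetical

mean = ages.mean()
std = ages.std()            # sample standard deviation (ddof=1), the pandas default
pop_std = ages.std(ddof=0)  # population standard deviation

# Share of values within 1 and 2 standard deviations of the mean.
within_1 = ages.between(mean - std, mean + std).mean()
within_2 = ages.between(mean - 2 * std, mean + 2 * std).mean()

# Coefficient of variation: spread relative to the mean.
cv = std / mean

print(f"std={std:.2f} pop_std={pop_std:.2f} cv={cv:.2f}")
print(f"within 1 std: {within_1:.0%}, within 2 std: {within_2:.0%}")
```

Shares far from those figures are a hint that the data is skewed or heavy-tailed, in which case the standard deviation alone can mislead.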

3. Min and Max - Spot Data Issues

df['age'].min()  # 0  ← Suspicious! Baby customers?
df['age'].max()  # 150 ← Data entry error!

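Check for: impossible values (such as negative or zero ages), values beyond any plausible limit, and placeholder codes like 0, -1, or 999 standing in for missing data.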
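
A minimal sketch of a range check follows. The bounds `MIN_AGE` and `MAX_AGE` are assumptions for an adult customer base, and the sample data is invented; adjust both to your domain.

```python
import pandas as pd

df = pd.DataFrame({"age": [0, 23, 35, 41, 150, -1, 67]})  # hypothetical

# Assumed plausible range for adult customers; adjust to your domain.
MIN_AGE, MAX_AGE = 18, 100

# between() is inclusive on both ends; missing values would also be flagged here.
out_of_range = df[~df["age"].between(MIN_AGE, MAX_AGE)]

print(out_of_range)
print(f"{len(out_of_range)} of {len(df)} rows outside [{MIN_AGE}, {MAX_AGE}]")
```

Listing the offending rows, rather than only the min and max, shows whether you are dealing with one typo or a systematic problem.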

4. Percentiles (25th, 75th) - Understand Distribution

Quartiles split data into 4 equal parts:

# Python
df['price'].quantile([0.25, 0.5, 0.75])

# Result:
0.25    19.99  ← 25th percentile (Q1)
0.50    39.99  ← 50th percentile (median)
0.75    79.99  ← 75th percentile (Q3)

Interquartile Range (IQR) = Q3 - Q1

IQR tells you the range of the middle 50% of data (less affected by outliers than range).
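
The IQR is also the basis of a common outlier check: flag anything more than 1.5 × IQR below Q1 or above Q3 (Tukey's fences). The sketch below applies it to invented prices. The 1.5 multiplier is a convention, not something this tip prescribes.

```python
import pandas as pd

# Hypothetical prices with one extreme value.
prices = pd.Series([9.99, 14.99, 19.99, 24.99, 39.99, 49.99, 79.99, 89.99, 999.00])

q1, q3 = prices.quantile([0.25, 0.75])
iqr = q3 - q1

# Tukey's fences: values beyond 1.5 * IQR from the quartiles.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = prices[(prices < lower) | (prices > upper)]

print(f"Q1={q1:.2f} Q3={q3:.2f} IQR={iqr:.2f}")
print(outliers)
```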

5. Count and Missing Values

# Python
df['email'].count()        # 8,500
df['email'].isna().sum()   # 1,500 missing

# Percentage missing
df['email'].isna().mean() * 100  # 15% missing

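Questions to ask: Is 15% missing acceptable for this analysis? Are values missing at random, or concentrated in particular segments? Should the affected rows be dropped, imputed, or flagged?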
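
The same counts can be collected for every column at once. This is a minimal sketch on an invented DataFrame; the column names and the `missing` summary table are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps in two columns.
df = pd.DataFrame({
    "email": ["a@example.com", None, "c@example.com", None],
    "age": [31, 45, np.nan, 28],
    "city": ["Oslo", "Lima", "Pune", "Kyiv"],
})

# One row per column: non-null count, missing count, and missing percentage.
missing = pd.DataFrame({
    "non_null": df.count(),
    "missing": df.isna().sum(),
    "missing_pct": (df.isna().mean() * 100).round(1),
}).sort_values("missing_pct", ascending=False)

print(missing)
```

Sorting by percentage puts the worst columns first, so you can decide where dropping or imputing matters most.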

Quick Analysis Pattern

# Python: Get all at once
df.describe()

# Output:
           age    salary  experience
count  10000.0   10000.0     10000.0
mean      35.5   75000.0         8.2
std       12.3   45000.0         5.1
min       18.0   25000.0         0.0
25%       27.0   45000.0         4.0
50%       34.0   58000.0         7.0
75%       43.0   95000.0        12.0
max       65.0  250000.0        30.0
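
A few variations on `describe()` are worth knowing. The sketch below uses a small invented DataFrame, and the `mean_over_median` column is a helper added for this example rather than part of pandas.

```python
import pandas as pd

# Hypothetical data with one non-numeric column.
df = pd.DataFrame({
    "age": [18, 27, 34, 43, 65],
    "salary": [25_000, 45_000, 58_000, 95_000, 250_000],
    "dept": ["ops", "ops", "eng", "eng", "eng"],
})

# Transposed summary: one row per numeric column, easier to scan and filter.
summary = df.describe().T

# Ratio well above 1 hints at right skew in that column.
summary["mean_over_median"] = summary["mean"] / summary["50%"]

# Extra tail percentiles help expose extreme values.
tails = df["salary"].describe(percentiles=[0.05, 0.95])

# include="all" also summarizes non-numeric columns (unique, top, freq).
full = df.describe(include="all")

print(summary)
print(tails)
print(full)
```

Note that `describe()` only reports the non-null count, so pair it with an explicit `isna()` check when missing data matters.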

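Red Flags to Look For

In the output above, salary has a mean (75,000) well above its median (58,000) and a max (250,000) far beyond the 75th percentile (95,000). Both are signs of right skew or outliers worth investigating before modeling.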

Golden Rule: Never skip descriptive statistics. Five minutes of checking these stats can save hours of debugging bad analysis later.
