Help Me Understand X-Learners in 30 Seconds or Less
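You want the effect of a treatment per person, not just the average. In practice that means the conditional average treatment effect (CATE): the expected effect for users with covariates X. But one group is tiny (few treated users, lots of controls). T-learners choke, because the outcome model fit on the tiny group is noisy and that noise goes straight into the effect estimate. X-learners don't.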
Three moves:
1. Fit two outcome models. μ₀ on controls, μ₁ on treated. Standard regression.
2. Impute the counterfactual. For each treated user, what would they have done untreated? Predict it with μ₀ and subtract it from what they actually did: D₁ = Y − μ₀(X). That's their imputed effect. Do the mirror for controls, with the sign flipped: D₀ = μ₁(X) − Y.
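3. Fit two effect models and blend. Regress the imputed effects on X within each group to get one CATE model per group: τ₁ from the treated, τ₀ from the controls. Combine them with propensity weights: τ(x) = e(x)·τ₀(x) + (1 − e(x))·τ₁(x), where e(x) is the probability of being treated. When treated users are rare, e(x) is small, so the blend leans on τ₁, whose imputed effects were built with μ₀, the outcome model fit on the big group.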
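The three moves above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: it uses plain least squares for every model, a constant propensity equal to the treated fraction (reasonable only if assignment is randomized), and synthetic data with a known effect. The names `fit_linear` and `x_learner` are made up for this example.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns a predict function."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ coef

def x_learner(X, y, t):
    """Return a function that predicts the CATE for new covariates."""
    treated, control = t == 1, t == 0

    # Move 1: one outcome model per group.
    mu0 = fit_linear(X[control], y[control])
    mu1 = fit_linear(X[treated], y[treated])

    # Move 2: imputed effects (observed minus predicted counterfactual).
    d1 = y[treated] - mu0(X[treated])   # treated users
    d0 = mu1(X[control]) - y[control]   # control users, sign flipped

    # Move 3: one effect model per group, blended by propensity.
    tau1 = fit_linear(X[treated], d1)
    tau0 = fit_linear(X[control], d0)
    e = treated.mean()  # assumption: constant propensity (randomized assignment)
    return lambda Xn: e * tau0(Xn) + (1 - e) * tau1(Xn)

# Synthetic, lopsided experiment: roughly 10% treated (made-up data).
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 2))
t = (rng.random(n) < 0.1).astype(int)
true_cate = 1.0 + 2.0 * X[:, 0]
y = X @ np.array([0.5, -1.0]) + t * true_cate + rng.normal(scale=0.1, size=n)

cate_hat = x_learner(X, y, t)(X)
mae = float(np.mean(np.abs(cate_hat - true_cate)))
print(f"mean absolute error of the CATE estimate: {mae:.4f}")
```

With observational data you would swap the constant `e` for a fitted propensity model, and the linear fits for whatever regressor suits your outcomes.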
Why it wins: the small group borrows strength from the big group's model. Use it when treatment assignment is lopsided (common in real experiments) and you care about heterogeneous effects, not just the average.