Dharmateja Priyadarshi Uddandarao, Senior Statistician – Data Scientist, Amazon

Propensity Score Matching (PSM)


Target-Metric Pre-Balancing

E[Y∣T=1] − E[Y∣T=0] = (E[Y(1)∣T=1] − E[Y(0)∣T=1]) + (E[Y(0)∣T=1] − E[Y(0)∣T=0])
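
To make the decomposition above concrete, here is a minimal simulation sketch. In the standard reading, the first bracket is the average treatment effect on the treated and the second is the selection bias. The data-generating process, the 0.05 uplift, and all variable names are illustrative assumptions and do not come from the article's data. Because both potential outcomes are simulated, each term can be computed directly and the identity checked numerically.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical users (assumption): baseline purchase propensity drives both
# treatment uptake and the outcome, which is what creates selection bias.
users = []
for _ in range(20000):
    baseline = random.random()             # prior purchase propensity in [0, 1)
    treated = random.random() < baseline   # high-propensity users are treated more often
    u = random.random()                    # shared noise for both potential outcomes
    y0 = 1 if u < 0.5 * baseline else 0           # Y(0): outcome without the feature
    y1 = 1 if u < 0.5 * baseline + 0.05 else 0    # Y(1): assumed +0.05 uplift
    users.append((treated, y0, y1))

treated_users = [(y0, y1) for t, y0, y1 in users if t]
control_users = [(y0, y1) for t, y0, y1 in users if not t]

# Left-hand side: the naive difference in observed outcomes.
# Treated users reveal Y(1); control users reveal Y(0).
naive = mean(y1 for _, y1 in treated_users) - mean(y0 for y0, _ in control_users)

# First bracket: average treatment effect on the treated (ATT).
att = mean(y1 - y0 for y0, y1 in treated_users)

# Second bracket: selection bias, the gap in untreated outcomes between groups.
selection_bias = mean(y0 for y0, _ in treated_users) - mean(y0 for y0, _ in control_users)

print(f"naive difference: {naive:.3f}")
print(f"ATT:              {att:.3f}")
print(f"selection bias:   {selection_bias:.3f}")
```

In real data Y(0) is never observed for treated users, so the selection-bias term cannot be computed this way. The sketch only illustrates why the naive comparison overstates the effect when treated users already had higher baseline propensity.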

Example: Recommendation Engine in a Fashion App

Applying PSM with Pre-Balancing



Figure 1 visualizes this matching effect. The left plot shows the distribution of baseline purchase propensity for the treated (orange) and control (blue) groups before matching; the treated distribution is clearly shifted toward higher values. After matching on the target metric (right plot), the orange and blue histograms overlap almost perfectly. This overlap indicates that the treated and control groups are now “level” in terms of prior behavior. Because of this pre-balancing, any remaining difference in purchase rates can be more credibly attributed to the recommendation feature itself.
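
The balance check described above can be sketched numerically as well as visually. The example below is a minimal illustration, not the article's implementation: it assumes simulated users, 1:1 nearest-neighbor matching with replacement on baseline purchase propensity, a hypothetical caliper of 0.01, and the standardized mean difference (SMD) as the balance statistic. All names and parameter values are assumptions.

```python
import bisect
import random
from statistics import mean, pstdev

random.seed(1)

# Hypothetical users (assumption): each is (baseline propensity, purchased flag).
treated, control = [], []
for _ in range(20000):
    baseline = random.random()
    is_treated = random.random() < 0.2 + 0.6 * baseline   # selection on baseline
    p_buy = 0.4 * baseline + (0.05 if is_treated else 0.0)  # assumed +0.05 uplift
    purchased = 1 if random.random() < p_buy else 0
    (treated if is_treated else control).append((baseline, purchased))

control.sort()
control_baselines = [b for b, _ in control]

def nearest_control(b):
    """Return the control user whose baseline propensity is closest to b."""
    i = bisect.bisect_left(control_baselines, b)
    candidates = control[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda c: abs(c[0] - b))

CALIPER = 0.01  # hypothetical maximum allowed baseline gap within a pair
pairs = []
for t in treated:
    c = nearest_control(t[0])
    if abs(c[0] - t[0]) <= CALIPER:
        pairs.append((t, c))

def smd(a, b):
    """Standardized mean difference using a pooled standard deviation."""
    pooled_sd = ((pstdev(a) ** 2 + pstdev(b) ** 2) / 2) ** 0.5
    return (mean(a) - mean(b)) / pooled_sd

# Balance on the target metric before and after matching.
smd_before = smd([b for b, _ in treated], control_baselines)
smd_after = smd([t[0] for t, _ in pairs], [c[0] for _, c in pairs])

# Purchase-rate difference before and after matching.
naive_diff = mean(y for _, y in treated) - mean(y for _, y in control)
matched_diff = mean(t[1] - c[1] for t, c in pairs)

print(f"SMD before matching: {smd_before:.3f}")
print(f"SMD after matching:  {smd_after:.3f}")
print(f"naive difference:    {naive_diff:.3f}")
print(f"matched difference:  {matched_diff:.3f}")
```

An SMD near zero after matching is the numeric counterpart of overlapping histograms. A common rule of thumb treats an absolute SMD below 0.1 as adequate balance, though the appropriate threshold depends on the application.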
Practical Takeaways

Conclusion

References
Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55. https://doi.org/10.1093/biomet/70.1.41
Imbens, G. W., & Rubin, D. B. (2015). Causal inference for statistics, social, and biomedical sciences: An introduction. Cambridge University Press. https://doi.org/10.1017/CBO9781139025751
Athey, S., & Imbens, G. W. (2016). Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences, 113(27), 7353–7360. https://doi.org/10.1073/pnas.1510489113
Gordon, B. R., Zettelmeyer, F., Bhargava, N., & Chapsky, D. (2019). A comparison of approaches to advertising measurement: Evidence from big field experiments at Facebook. Marketing Science, 38(2), 193–225. https://doi.org/10.1287/mksc.2018.1125
Johansson, F. D., Shalit, U., & Sontag, D. (2016). Learning representations for counterfactual inference. Proceedings of the 33rd International Conference on Machine Learning (ICML), 3020–3029.