Evaluate Propensity Score Matching Alternatives and Diagnostics
Context
You are reviewing an observational study that used Propensity Score Matching (PSM) to estimate the causal impact of a UI change on user watch time. Randomized experimentation was not feasible, so the analysts used historical logs and user covariates to construct a matched sample and estimate the average treatment effect on the treated (ATT).
Task
Answer the following about best practices in PSM for product analytics:
- Why is standardized mean difference (SMD) ≤ 0.1 often used as a post-matching balance threshold?
- If logistic regression is not appropriate for the propensity model, what alternatives would you consider and why?
- How would you diagnose residual confounding after matching?
- Describe one method to estimate the variance of the treatment effect under PSM and when it is appropriate.
Hint: Discuss overlap/positivity, balance diagnostics (including higher moments and transformations), causal estimands (ATE vs ATT), and robust variance estimation.
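The balance diagnostic named in the task can be made concrete with a short sketch. It assumes the common pooled-standard-deviation form of the SMD (some authors use the treated-group standard deviation instead when the estimand is the ATT) and adds a variance ratio as a simple second-moment check. The function names are illustrative and not taken from any particular library.

```python
import numpy as np

def standardized_mean_difference(x_treated, x_control):
    """SMD = (mean_t - mean_c) / sqrt((var_t + var_c) / 2), pooled-SD convention."""
    x_treated = np.asarray(x_treated, dtype=float)
    x_control = np.asarray(x_control, dtype=float)
    mean_diff = x_treated.mean() - x_control.mean()
    pooled_sd = np.sqrt((x_treated.var(ddof=1) + x_control.var(ddof=1)) / 2.0)
    if pooled_sd == 0.0:
        # Both groups are constant: balanced only if the constants agree.
        return 0.0 if mean_diff == 0.0 else float("inf")
    return float(mean_diff / pooled_sd)

def variance_ratio(x_treated, x_control):
    """Ratio of sample variances; values far from 1 suggest imbalance beyond the mean."""
    var_t = np.var(np.asarray(x_treated, dtype=float), ddof=1)
    var_c = np.var(np.asarray(x_control, dtype=float), ddof=1)
    return float(var_t / var_c)

# Hypothetical covariate values for matched treated and control users.
treated = [2.0, 3.0, 4.0]
control = [1.0, 2.0, 3.0]
print(abs(standardized_mean_difference(treated, control)) <= 0.1)
```

In practice the same check would be repeated for every covariate, and for transformations such as squares or interactions, before and after matching.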
Constraints & Assumptions
- Preserve the scope, facts, inputs, and requested outputs from the prompt above.
- If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
- Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.
Clarifying Questions to Ask
- Clarify the random variables, distributional assumptions, independence assumptions, and desired output.
- Show enough derivation for the interviewer to follow the reasoning.
- Explain how you would validate the result with simulation or sensitivity checks.
What a Strong Answer Covers
- A correct setup with definitions, formulas, and boundary conditions.
- A step-by-step derivation or estimation plan.
- Interpretation of the result, including uncertainty and practical limitations.
- Checks for assumptions, edge cases, and numerical stability.
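One assumption check a strong answer might include is overlap (positivity). The sketch below is a minimal illustration, assuming propensity scores have already been estimated by some model. The function name, the returned keys, and the `eps` cutoff of 0.01 are hypothetical choices rather than a standard.

```python
import numpy as np

def common_support_summary(score, treated, eps=0.01):
    """Summarize overlap between treated and control propensity scores."""
    score = np.asarray(score, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    s_t, s_c = score[treated], score[~treated]
    # Common support: the score range covered by both groups.
    lo = max(s_t.min(), s_c.min())
    hi = min(s_t.max(), s_c.max())
    # Treated units outside the common support have no comparable controls.
    outside = (s_t < lo) | (s_t > hi)
    # Scores very close to 0 or 1 signal near-violations of positivity.
    extreme = (score < eps) | (score > 1.0 - eps)
    return {
        "support": (float(lo), float(hi)),
        "treated_outside_share": float(outside.mean()),
        "extreme_score_share": float(extreme.mean()),
    }

# Hypothetical scores: three controls followed by three treated users.
summary = common_support_summary(
    [0.1, 0.4, 0.6, 0.3, 0.7, 0.995], [0, 0, 0, 1, 1, 1]
)
print(summary)
```

A large share of treated units outside the common support suggests that the ATT is only identified for a subset of treated users, which changes the estimand and should be reported.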
Follow-up Questions
- How would the result change if the assumptions were relaxed?
- Can you verify the answer with a simulation?
- What is the most likely source of estimation error?
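The simulation follow-up could be approached along these lines. This is a sketch under stated assumptions: a single confounder, a known (not estimated) propensity score, a constant treatment effect of 2.0, and 1:1 nearest-neighbor matching with replacement. All names and parameter values are illustrative.

```python
import numpy as np

def simulate(n=4000, true_effect=2.0, seed=0):
    """Generate confounded data: x drives both treatment uptake and the outcome."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    p = 1.0 / (1.0 + np.exp(-0.8 * x))   # true propensity score
    t = rng.random(n) < p                # treatment indicator (boolean)
    y = true_effect * t + 1.5 * x + rng.normal(size=n)
    return x, p, t, y

def att_nearest_neighbor(score, t, y):
    """ATT via 1:1 nearest-neighbor matching with replacement on a scalar score."""
    order = np.argsort(score[~t])
    c_score = score[~t][order]
    c_y = y[~t][order]
    # For each treated unit, find the closest control on either side.
    pos = np.searchsorted(c_score, score[t])
    left = np.clip(pos - 1, 0, len(c_score) - 1)
    right = np.clip(pos, 0, len(c_score) - 1)
    use_left = np.abs(score[t] - c_score[left]) <= np.abs(c_score[right] - score[t])
    match = np.where(use_left, left, right)
    return float(np.mean(y[t] - c_y[match]))

x, p, t, y = simulate()
naive = float(y[t].mean() - y[~t].mean())   # ignores confounding
matched = att_nearest_neighbor(p, t, y)
print(f"naive: {naive:.3f}  matched ATT: {matched:.3f}  truth: 2.000")
```

Because the true effect is known by construction, the gap between the naive and matched estimates shows how much confounding the matching removes. Repeating the run over many seeds would also give an empirical sampling distribution against which a proposed variance estimator can be checked.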