Estimate Treatment Effects Using PSM, DiD, and DML Methods
Causal Impact of Marketing Campaigns: PSM, DiD, Synthetic Control, and DML
Scenario
You have observational data from a marketing campaign where some users/regions were exposed to a campaign (treatment) and others were not (control). You also have outcomes measured before and after the campaign for each unit.
Tasks
- Explain how you would use Propensity Score Matching (PSM) to estimate the treatment effect. Specify assumptions, how you would check overlap and balance, and what robustness checks you would run.
- Explain how you would use Difference-in-Differences (DiD) to estimate the treatment effect. State identification assumptions (e.g., parallel trends), how you would implement it in practice, and robustness checks.
- When is a Synthetic Control Method preferable to DiD? Provide the intuition and the key conditions that make it stronger than standard DiD.
- Describe the Double Machine Learning (DML) framework for causal inference, focusing on why it is useful with high-dimensional covariates. Include the role of cross-fitting and orthogonalization, and the required assumptions.
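As a concrete anchor for the DiD task, the canonical two-group, two-period estimate can be sketched from the four group-by-period means. This is a minimal illustration, not a full implementation: the function name `did_estimate`, the row layout, and the toy numbers are all hypothetical, and the estimate is causal only if parallel trends holds.

```python
from statistics import mean


def did_estimate(rows):
    """Two-group, two-period DiD from (treated, post, outcome) rows.

    Computes (treated post - treated pre) - (control post - control pre).
    Assumes parallel trends; the row layout is a hypothetical convention.
    """
    cells = {(t, p): [] for t in (0, 1) for p in (0, 1)}
    for treated, post, outcome in rows:
        cells[(int(treated), int(post))].append(outcome)
    if any(len(values) == 0 for values in cells.values()):
        raise ValueError("every group-by-period cell needs at least one observation")
    m = {key: mean(values) for key, values in cells.items()}
    treated_change = m[(1, 1)] - m[(1, 0)]
    control_change = m[(0, 1)] - m[(0, 0)]
    return treated_change - control_change


# Hypothetical toy data: (treated, post, outcome)
rows = [
    (1, 0, 10.0), (1, 0, 12.0),   # treated, before the campaign
    (1, 1, 16.0), (1, 1, 18.0),   # treated, after the campaign
    (0, 0, 8.0), (0, 0, 10.0),    # control, before the campaign
    (0, 1, 10.0), (0, 1, 12.0),   # control, after the campaign
]
print(did_estimate(rows))  # treated change 6.0 minus control change 2.0 -> 4.0
```

In practice the same quantity is usually obtained as the coefficient on the treated-by-post interaction in a regression, which also makes it easier to add covariates and cluster standard errors at the unit level.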
Constraints & Assumptions
- Preserve the scope, facts, inputs, and requested outputs from the prompt above.
- If the prompt leaves a detail unspecified, state a reasonable assumption before relying on it.
- Keep the answer interview-ready: concise enough to present, but concrete enough to implement or evaluate.
Clarifying Questions to Ask
- Clarify the random variables, distributional assumptions, independence assumptions, and desired output.
- Show enough derivation for the interviewer to follow the reasoning.
- Explain how you would validate the result with simulation or sensitivity checks.
What a Strong Answer Covers
- A correct setup with definitions, formulas, and boundary conditions.
- A step-by-step derivation or estimation plan.
- Interpretation of the result, including uncertainty and practical limitations.
- Checks for assumptions, edge cases, and numerical stability.
Follow-up Questions
- How would the result change if the assumptions were relaxed?
- Can you verify the answer with a simulation?
- What is the most likely source of estimation error?
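One way to approach the simulation follow-up is to generate data with a known treatment effect and a confounder, then compare a naive difference in means against a DML-style partialling-out estimate with cross-fitting. The sketch below is illustrative only: the data-generating process, the true effect of 2.0, and all function names are assumptions, and simple linear fits stand in for the flexible ML learners that DML would normally use for the nuisance functions.

```python
import random


def fit_line(x, y):
    """OLS intercept and slope for y ~ a + b * x (single covariate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def dml_partial_linear(x, d, y, n_folds=2, seed=0):
    """Residual-on-residual estimate of the effect of d on y with cross-fitting.

    Nuisance models E[d|x] and E[y|x] are fit on the other folds and used
    to residualize the held-out fold. Linear fits are a stand-in (assumption)
    for the ML learners a real DML pipeline would use.
    """
    n = len(x)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    d_res, y_res = [0.0] * n, [0.0] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a_d, b_d = fit_line([x[i] for i in train], [d[i] for i in train])
        a_y, b_y = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            d_res[i] = d[i] - (a_d + b_d * x[i])   # treatment residual
            y_res[i] = y[i] - (a_y + b_y * x[i])   # outcome residual
    # Orthogonalized final stage: regress outcome residuals on treatment residuals.
    return sum(dr * yr for dr, yr in zip(d_res, y_res)) / sum(dr * dr for dr in d_res)


def simulate(n=4000, true_effect=2.0, seed=1):
    """Hypothetical data: x confounds both exposure d and outcome y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    d = [1.0 if xi + rng.gauss(0, 1) > 0 else 0.0 for xi in x]
    y = [true_effect * di + 3.0 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
    return x, d, y


def naive_difference(d, y):
    """Unadjusted treated-minus-control difference in mean outcomes."""
    treated = [yi for di, yi in zip(d, y) if di == 1.0]
    control = [yi for di, yi in zip(d, y) if di == 0.0]
    return sum(treated) / len(treated) - sum(control) / len(control)


x, d, y = simulate()
naive = naive_difference(d, y)
est = dml_partial_linear(x, d, y)
print(f"true effect 2.0 | naive {naive:.2f} | cross-fitted estimate {est:.2f}")
```

Because treated units have systematically higher `x` in this setup, the naive difference overstates the effect, while the cross-fitted estimate should land close to the true value. Rerunning with a confounder omitted from the nuisance fits is a quick way to show how the estimate degrades when unconfoundedness fails.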