This question evaluates a candidate's proficiency in causal inference and Double Machine Learning (DML) for estimating average treatment effects from observational data. It covers representing and validating address-derived text features, nuisance estimation, overlap diagnostics, sensitivity analysis, subgroup effect reporting, and geographic privacy and fairness concerns, and is categorized under Machine Learning and applied causal inference. It is commonly asked to gauge both conceptual understanding (orthogonalization, sample splitting, identification assumptions) and practical skill in selecting and validating text and geospatial features, checking overlap/positivity, conducting sensitivity analyses, and controlling for multiple comparisons.
You have observational data on customer satisfaction (CSAT) surveys. Some customers received a first reminder to complete the survey; others did not. You want the Average Treatment Effect (ATE) of receiving the first reminder on CSAT, using Double Machine Learning (DML). Addresses are available as text and can be used to construct geospatial features and embeddings.
Assume: (a) treatment assignment is not randomized but depends on observed pre-treatment covariates; (b) CSAT is measured on a numeric scale (e.g., 1–5) for completed surveys; (c) all features used are measured or defined prior to the reminder decision.
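As a starting point, the setup above can be sketched as a cross-fitted doubly robust (AIPW) estimator, which is one common DML score for a binary treatment. This is a minimal illustration under stated assumptions, not a reference answer. The data are synthetic, with a simulated continuous outcome standing in for CSAT. The function names (`fit_logistic`, `fit_linear`, `dml_ate`) are hypothetical. Plain logistic and linear regressions stand in for the flexible learners (for example gradient boosting on address-derived features) that a real analysis would likely use. The returned propensity scores support a basic overlap check.

```python
import numpy as np

def _with_intercept(X):
    return np.column_stack([np.ones(len(X)), X])

def fit_logistic(X, d, n_iter=25, ridge=1e-3):
    """Propensity model: Newton-Raphson logistic regression with a small ridge term."""
    Xb = _with_intercept(X)
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        grad = Xb.T @ (d - p) - ridge * beta
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb + ridge * np.eye(Xb.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict_logistic(beta, X):
    return 1.0 / (1.0 + np.exp(-_with_intercept(X) @ beta))

def fit_linear(X, y):
    """Outcome model: ordinary least squares."""
    coef, *_ = np.linalg.lstsq(_with_intercept(X), y, rcond=None)
    return coef

def predict_linear(coef, X):
    return _with_intercept(X) @ coef

def dml_ate(X, d, y, n_folds=5, clip=0.01, seed=0):
    """Cross-fitted AIPW estimate of the ATE with a plug-in standard error."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = rng.permutation(n) % n_folds   # random, near-equal fold assignment
    psi = np.empty(n)                      # per-unit orthogonal score
    e_hat_all = np.empty(n)                # out-of-fold propensity scores
    for k in range(n_folds):
        test = folds == k
        train = ~test
        # Nuisance models are fit on the other folds only (sample splitting).
        e_hat = predict_logistic(fit_logistic(X[train], d[train]), X[test])
        e_hat = np.clip(e_hat, clip, 1.0 - clip)  # guard against extreme weights
        t1 = train & (d == 1)
        t0 = train & (d == 0)
        m1 = predict_linear(fit_linear(X[t1], y[t1]), X[test])
        m0 = predict_linear(fit_linear(X[t0], y[t0]), X[test])
        dt, yt = d[test], y[test]
        psi[test] = (m1 - m0
                     + dt * (yt - m1) / e_hat
                     - (1.0 - dt) * (yt - m0) / (1.0 - e_hat))
        e_hat_all[test] = e_hat
    return psi.mean(), psi.std(ddof=1) / np.sqrt(n), e_hat_all

# Synthetic example (assumption: true ATE = 0.4, three pre-treatment covariates).
rng = np.random.default_rng(42)
n = 4000
X = rng.normal(size=(n, 3))
true_e = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.5 * X[:, 1])))
d = rng.binomial(1, true_e).astype(float)  # reminder sent (1) or not (0)
y = 3.0 + 0.4 * d + 0.5 * X[:, 0] + 0.3 * X[:, 2] + rng.normal(scale=0.5, size=n)

ate, se, e_hat = dml_ate(X, d, y)
print(f"ATE estimate: {ate:.3f} (SE {se:.3f})")
print(f"95% CI: [{ate - 1.96 * se:.3f}, {ate + 1.96 * se:.3f}]")
print(f"Propensity range: [{e_hat.min():.3f}, {e_hat.max():.3f}]")  # overlap check
```

If many estimated propensities sit at the clipping bounds, overlap is weak and the estimate leans on extrapolation. Trimming the sample or redefining the target population would then be worth discussing before reporting an ATE.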