This question evaluates a data scientist's competency in causal inference, time-series analysis, and synthetic control methodology for non-randomized experiments, including donor pool construction, pre/post diagnostics, and sensitivity and heterogeneity assessments.
You need to make a go/no-go decision for a high-impact feature that cannot be A/B tested due to policy/infrastructure constraints. Assume we can gate the feature to one or a small number of units (e.g., a geography × platform cell) and measure time-series KPIs across comparable units.
Propose a complete analysis plan using Synthetic Control (or justify an alternative if SCM is unsuitable). Address the following:
(a) Treatment unit and donor pool construction
(b) Pre-intervention window and external factors
(c) Metrics and decision rubric
(d) Predictors, weights, and tuning
(e) Diagnostics and inference
(f) Sensitivity and spillovers
(g) Heterogeneity and persistence
(h) Fallbacks
(i) Launch mapping
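As a rough illustration of parts (d) and (e), the sketch below fits donor weights on the pre-intervention window and runs an in-space placebo test. It is a minimal sketch under stated assumptions, not a definitive implementation: it matches on the outcome series only (no additional predictors or predictor weights), the data are simulated, and every name in it (`fit_scm_weights`, `rmspe_ratio`, `placebo_p_value`, the `(T, N)` matrix `Y`, the cutoff `t0`) is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def fit_scm_weights(y_treated_pre, Y_donors_pre):
    """Weights in [0, 1] summing to 1 that minimise pre-period squared error."""
    n_donors = Y_donors_pre.shape[1]
    w0 = np.full(n_donors, 1.0 / n_donors)  # start from equal weights

    def loss(w):
        resid = y_treated_pre - Y_donors_pre @ w
        return float(resid @ resid)

    res = minimize(
        loss,
        w0,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_donors,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return res.x


def rmspe_ratio(y, Y_donors, t0):
    """Post/pre RMSPE ratio of the gap between a unit and its synthetic control."""
    w = fit_scm_weights(y[:t0], Y_donors[:t0])
    gap = y - Y_donors @ w
    pre = np.sqrt(np.mean(gap[:t0] ** 2))
    post = np.sqrt(np.mean(gap[t0:] ** 2))
    return post / pre


def placebo_p_value(Y, treated_idx, t0):
    """Treat each unit in turn as pseudo-treated; rank the real unit's ratio.

    The truly treated unit is always kept out of the donor pool.
    """
    n_units = Y.shape[1]
    ratios = np.empty(n_units)
    for j in range(n_units):
        donor_cols = [k for k in range(n_units) if k not in (j, treated_idx)]
        ratios[j] = rmspe_ratio(Y[:, j], Y[:, donor_cols], t0)
    return float(np.mean(ratios >= ratios[treated_idx]))


# Simulated example (assumption: 60 periods, 12 units, launch at period 40).
rng = np.random.default_rng(0)
T, N, t0 = 60, 12, 40
trend = np.linspace(10.0, 12.0, T)
loadings = rng.uniform(0.5, 1.5, N)
Y = trend[:, None] * loadings[None, :] + rng.normal(0.0, 0.1, (T, N))
Y[:, 0] = Y[:, 1:4].mean(axis=1) + rng.normal(0.0, 0.1, T)  # unit 0 is treated
Y[t0:, 0] += 1.0  # injected post-launch lift, for illustration only

w = fit_scm_weights(Y[:t0, 0], Y[:t0, 1:])
p = placebo_p_value(Y, treated_idx=0, t0=t0)
print("donor weights:", np.round(w, 3))
print("placebo p-value:", p)
```

With N units the smallest attainable placebo p-value is 1/N, so a small donor pool caps how strong the evidence can be; that limit is worth building into the decision rubric in (c) rather than discovering it after launch.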