##### Scenario
A feature has already been launched to 100% of traffic; no control or holdout group exists.

##### Question
How would you estimate the causal impact of the launch? Outline possible methodologies, required data, assumptions, and how you would communicate uncertainty.

##### Hints
Discuss pre-post analysis, synthetic controls, difference-in-differences, or propensity scoring; emphasize validation and sensitivity checks.

This question evaluates causal impact estimation after a full product rollout with no holdout. Strong answers use interrupted time series, synthetic control, Bayesian structural time series, exposure variation, and placebo tests to build and validate a credible counterfactual.

What difficulty level is this interview question?

This is a hard difficulty Analytics & Experimentation question, commonly asked during Technical Screen rounds at Airbnb.

What role is this question designed for?

This question is commonly asked for Data Scientist candidates at Airbnb during technical interviews.

Estimate Causal Impact Using Synthetic Control Methods

A product feature has already launched to 100% of traffic, and no explicit control or holdout group exists. You need to estimate causal impact on a business metric in a marketplace with seasonality and segment heterogeneity.

Constraints & Assumptions

Historical pre- and post-launch data are available.
Unit-level or geo-level panels may be available.
Exposure may vary even after a 100% rollout, for example by platform, market, app version, or eligibility.
The analysis must build a credible counterfactual.

Clarifying Questions to Ask

Was rollout truly simultaneous, or did exposure vary by platform, market, app version, or eligibility?
What is the primary outcome and post-period horizon?
Are there major concurrent launches, marketing campaigns, or external shocks?
What historical units could serve as donors for a synthetic control?

Part 1 - Methods

What causal inference methodologies are suitable for this setting?

What This Part Should Cover

Interrupted time series, Bayesian structural time series, synthetic control, matched controls, difference-in-differences using exposure variation, and regression adjustment.
When each method is appropriate.
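
To make the synthetic control idea concrete, here is a minimal sketch, assuming a geo-level or segment-level panel in which some donor units were unaffected or less affected by the launch. It finds non-negative donor weights that sum to 1 and best reproduce the treated unit's pre-launch series, then reads the post-launch gap as the estimated effect. The function names (`project_to_simplex`, `synth_weights`), the projected gradient solver, and the simulated data are illustrative assumptions, not a reference implementation.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def synth_weights(treated_pre, donors_pre, n_iter=2000):
    """Non-negative donor weights summing to 1 that best match the treated pre-period.

    treated_pre has shape (T_pre,); donors_pre has shape (T_pre, n_donors).
    """
    treated_pre = np.asarray(treated_pre, dtype=float)
    donors_pre = np.asarray(donors_pre, dtype=float)
    # Subtracting the per-period donor mean leaves the residual unchanged whenever
    # the weights sum to 1, and makes gradient descent much better conditioned.
    center = donors_pre.mean(axis=1)
    D = donors_pre - center[:, None]
    y = treated_pre - center
    n = D.shape[1]
    w = np.full(n, 1.0 / n)
    # Step size = 1 / Lipschitz constant of the squared-error gradient.
    step = 1.0 / (2.0 * np.linalg.norm(D, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ w - y)
        w = project_to_simplex(w - step * grad)
    return w

# Simulated panel (hypothetical): 4 donor units, 40 pre-launch and 10 post-launch periods.
rng = np.random.default_rng(0)
T_pre, T_post = 40, 10
donors = 100.0 + rng.normal(size=(T_pre + T_post, 4))
true_w = np.array([0.5, 0.3, 0.2, 0.0])
treated = donors @ true_w
treated[T_pre:] += 3.0  # assumed launch lift, for illustration only

w = synth_weights(treated[:T_pre], donors[:T_pre])
counterfactual = donors @ w
pre_rmse = np.sqrt(np.mean((treated[:T_pre] - counterfactual[:T_pre]) ** 2))
effect = np.mean(treated[T_pre:] - counterfactual[T_pre:])
print("weights:", np.round(w, 3))
print("pre-period RMSE:", pre_rmse, "estimated effect:", effect)
```

The pre-period RMSE is the first thing to inspect: if it is large relative to the estimated effect, the counterfactual is not credible and another method (or a different donor pool) is likely a better choice.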

Part 2 - Required Data

What data does each method require?

What This Part Should Cover

Pre/post outcome history, untreated or less-exposed donor units, covariates, exposure logs, seasonality variables, and market or segment panels.
Data quality and stable metric definitions.

Part 3 - Assumptions and Validation

What identification assumptions must hold, and how would you validate them?

What This Part Should Cover

Good pre-period fit, no unmeasured concurrent shocks, stable relationship between treated and donor units, no spillovers, and comparable trends.
Placebo tests, backtesting, pre-trend checks, donor sensitivity, and falsification outcomes.
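
One way to run a placebo (backtesting) check is to pretend the launch happened at earlier dates inside the pre-period and confirm the estimated "effect" there is close to zero. The sketch below assumes a single daily series and uses a plain linear trend as the counterfactual model; the helper names `trend_gap` and `placebo_in_time`, the 7-day window, and the simulated metric are all hypothetical. A real analysis would also need seasonality and holiday terms.

```python
import numpy as np

def trend_gap(y, cut, window):
    """Fit a linear trend on y[:cut]; return mean(actual - forecast) on y[cut:cut + window]."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    slope, intercept = np.polyfit(t[:cut], y[:cut], deg=1)
    forecast = intercept + slope * t[cut:cut + window]
    return float(np.mean(y[cut:cut + window] - forecast))

def placebo_in_time(y, launch_idx, window, min_train=14):
    """Gaps at fake launch dates whose evaluation window ends before the real launch."""
    last_cut = launch_idx - window
    return {cut: trend_gap(y, cut, window) for cut in range(min_train, last_cut + 1)}

# Hypothetical daily metric: linear trend plus noise, with a lift of 8 from day 60 onward.
rng = np.random.default_rng(1)
days = np.arange(75)
metric = 200.0 + 1.5 * days + rng.normal(scale=2.0, size=days.size)
metric[60:] += 8.0

real_gap = trend_gap(metric, cut=60, window=7)
placebo_gaps = placebo_in_time(metric, launch_idx=60, window=7)
worst_placebo = max(abs(g) for g in placebo_gaps.values())
print(f"real gap: {real_gap:.2f}, largest placebo gap: {worst_placebo:.2f}")
```

If the placebo gaps are routinely as large as the real gap, the model's forecast error alone could explain the apparent launch effect.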

Part 4 - Uncertainty and Communication

How would you quantify and communicate uncertainty?

What This Part Should Cover

Confidence or credible intervals, placebo distributions, sensitivity analysis, scenario ranges, and limitations.
Clear recommendation despite uncertainty.
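
A placebo distribution can be turned into a simple uncertainty statement. The sketch below assumes you already have one effect estimate for the treated unit and a set of placebo estimates (for example, from applying the same method to donor units or fake launch dates). It computes a permutation-style p-value that counts the treated unit among the candidates, plus a percentile band of the placebo gaps as a rough picture of noise. The function name `placebo_summary` and the numbers are hypothetical.

```python
import numpy as np

def placebo_summary(treated_effect, placebo_effects):
    """Permutation-style p-value and a 2.5-97.5 percentile band of placebo gaps."""
    placebo = np.asarray(placebo_effects, dtype=float)
    # Count placebos at least as extreme (in absolute value) as the treated estimate.
    n_extreme = int(np.sum(np.abs(placebo) >= abs(treated_effect)))
    p_value = (1 + n_extreme) / (1 + placebo.size)
    low, high = np.percentile(placebo, [2.5, 97.5])
    return {"p_value": p_value, "placebo_band": (float(low), float(high))}

# Hypothetical estimates, for illustration only.
summary = placebo_summary(5.0, [0.5, -1.0, 2.0, -0.3, 1.2, 6.0, -0.8, 0.1, 1.9])
print(summary)
```

With few placebos the p-value is coarse (its smallest possible value is 1 / (1 + number of placebos)), which is worth stating explicitly when presenting the result.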

What a Strong Answer Covers

A strong answer does not pretend a 100% rollout is an experiment; it builds the best possible counterfactual, validates assumptions, and communicates uncertainty and caveats honestly.

Follow-up Questions

What if synthetic control has poor pre-period fit?
How would you handle seasonality and holidays?
How would you use exposure intensity after a 100% rollout?
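
For the exposure-intensity follow-up, one possible sketch is a dose-response difference-in-differences: regress each unit's pre-to-post change on its exposure intensity. The intercept absorbs shocks common to all units, and the slope is read as the effect per unit of exposure, assuming units with different exposure would otherwise have followed parallel trends. The function name `exposure_did` and the unit-level numbers are hypothetical.

```python
import numpy as np

def exposure_did(pre, post, exposure):
    """OLS of (post - pre) on exposure intensity; one row per unit (market, segment, ...)."""
    pre, post, exposure = (np.asarray(a, dtype=float) for a in (pre, post, exposure))
    delta = post - pre
    X = np.column_stack([np.ones_like(exposure), exposure])
    coef, *_ = np.linalg.lstsq(X, delta, rcond=None)
    return {"common_shift": float(coef[0]), "effect_per_unit_exposure": float(coef[1])}

# Hypothetical units: exposure is the share of sessions that actually saw the feature.
exposure = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
pre = np.array([50.0, 80.0, 65.0, 90.0, 70.0])
post = pre + 2.0 + 10.0 * exposure  # simulated: common shift of 2, effect of 10 per unit exposure

result = exposure_did(pre, post, exposure)
print(result)
```

This only identifies an effect if exposure is not itself driven by the outcome trend (for example, fast-growing markets adopting the new app version sooner), so the pre-trend checks from Part 3 apply here as well.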