Apply instrumental variables under interference
Company: Uber
Role: Data Scientist
Category: Statistics & Math
Difficulty: hard
Interview Round: Technical Screen
Suppose a clean A/B test isn’t feasible for a new ride‑sharing feature due to interference. Propose an instrumental‑variables approach to estimate its causal effect on trip volume. State and justify all IV assumptions precisely—relevance, exclusion, independence (as‑if random), and monotonicity (if claiming LATE). Give at least two concrete, plausibly exogenous instruments (e.g., staggered driver app version eligibility, exogenous weather shocks affecting demand but not the feature directly) and write the first‑stage and second‑stage (2SLS) equations. Describe how you’ll diagnose weak instruments (first‑stage F‑stat), run over‑identification tests (Sargan/Hansen), handle clustering/heteroskedasticity, and assess violations of exclusion under marketplace spillovers. Would an effectively unlimited supply environment make the exclusion restriction more or less credible, and why? If assumptions partially fail, outline sensitivity analyses or bounds (e.g., Conley‑type).
Quick Answer: This question tests causal inference with instrumental variables when interference rules out a clean A/B test. It assesses whether you can define units and aggregation levels for market‑level spillovers, state the IV assumptions (relevance, exclusion restriction, independence, monotonicity), and set up an estimation framework such as two‑stage least squares. It is commonly asked in Statistics & Math interviews for data scientist roles because networked marketplaces invalidate simple A/B tests. The question sits in the econometrics/causal inference domain and primarily assesses practical application of IV methods, while also requiring conceptual understanding of identification, robustness diagnostics, and sensitivity analysis.
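As a rough illustration of the two‑stage least squares setup and the first‑stage F‑statistic mentioned above, here is a minimal sketch on simulated market‑level data. Everything in it is an assumption made for illustration: the variable names (`z1`, `z2`, `d`, `y`, `u`), the data‑generating process, and the true effect of 2.0 are hypothetical and do not come from any real Uber data. The F‑statistic shown is the plain homoskedastic version. A real analysis would use cluster‑robust inference (for example by city or market) and a dedicated IV routine, because standard errors taken from a manually run second stage are not valid.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000  # hypothetical number of market-period observations

# Simulated data (all values are assumptions for illustration only).
u = rng.normal(size=n)    # unobserved confounder, e.g. latent local demand
z1 = rng.normal(size=n)   # hypothetical instrument 1 (e.g. app-version eligibility share)
z2 = rng.normal(size=n)   # hypothetical instrument 2
d = 0.8 * z1 + 0.5 * z2 + u + rng.normal(size=n)   # feature exposure (endogenous)
y = 2.0 * d + 3.0 * u + rng.normal(size=n)         # trip volume; true effect is 2.0


def ols(X, target):
    """Least-squares coefficients for target ~ X (X already holds an intercept column)."""
    return np.linalg.lstsq(X, target, rcond=None)[0]


def two_stage_least_squares(y, d, Z):
    """Return (beta_2sls, first_stage_F) for one endogenous regressor d.

    First stage:  d = pi_0 + Z @ pi + v
    Second stage: y = alpha + beta * d_hat + e
    """
    n_obs, k = Z.shape
    ones = np.ones((n_obs, 1))
    Z1 = np.hstack([ones, Z])

    # First stage: project the endogenous regressor on the instruments.
    d_hat = Z1 @ ols(Z1, d)

    # Homoskedastic F-test of H0: all instrument coefficients are zero.
    rss_unrestricted = np.sum((d - d_hat) ** 2)
    rss_restricted = np.sum((d - d.mean()) ** 2)
    f_stat = ((rss_restricted - rss_unrestricted) / k) / (
        rss_unrestricted / (n_obs - k - 1)
    )

    # Second stage: regress the outcome on the fitted exposure.
    X2 = np.hstack([ones, d_hat[:, None]])
    beta = ols(X2, y)[1]
    return beta, f_stat


Z = np.column_stack([z1, z2])
beta_iv, f_stat = two_stage_least_squares(y, d, Z)

# Naive OLS for comparison; it is biased here because u drives both d and y.
beta_ols = ols(np.column_stack([np.ones(n), d]), y)[1]

print(f"OLS estimate:  {beta_ols:.3f}")
print(f"2SLS estimate: {beta_iv:.3f}")
print(f"First-stage F: {f_stat:.1f}")
```

In this simulation the instruments are valid by construction, so the 2SLS estimate should land near the true effect while OLS is pulled away by the confounder. With two instruments and one endogenous regressor the model is over‑identified, which is the setting where a Sargan/Hansen test becomes available. That test is not implemented in this sketch.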