Prove causality for trading metric drop
Company: Robinhood
Role: Data Scientist
Category: Statistics & Math
Difficulty: hard
Interview Round: Technical Screen
You must separate market-driven fluctuations from a product-caused decline in executed_trades per active user.
1) Set up a difference-in-differences design around a 2025-07-10 release. Define: pre-window 2025-06-01–2025-07-09, post-window 2025-07-10–2025-08-21. Propose treatment and control groups (e.g., cohorts exposed vs not yet exposed; assets affected vs unaffected; geos rolling out later). Specify model equation, fixed effects, and clustered SEs. State the identifying assumptions (parallel trends, no spillovers) and exactly how you’ll test them. If pre-trends fail, describe a fix (e.g., synthetic control, matching + staggered DiD, or event study with leads/lags) and why it’s valid.
2) Detect structural breaks and quantify effect size using at least two methods: CUSUM or Bai–Perron change points and Bayesian Structural Time Series (BSTS). Explain how you’d reconcile effect sizes and uncertainty when methods disagree.
3) Run a power calculation for detecting a 10% decrease in executed_trades per active user with daily aggregation, given α=0.05, power=0.80, mean active users/day=200,000, baseline mean=1.0 trades, SD=1.5 trades. Show the formulas (pooled-variance two-sample t-test), report the minimum detectable effect, and state the resulting sample size or window length needed.
4) Propose robustness checks: placebo dates, symbol-level randomization inference, wild bootstrap SEs, and sensitivity of results to volatility controls (e.g., VIX, SPX return) and holiday dummies. Define pass/fail criteria that would change your decision to ship a fix.
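As a starting point for parts 1 and 4, the simplest two-group, two-period version of the design can be written as y_it = α + β·Treated_i + γ·Post_t + δ·(Treated_i × Post_t) + ε_it, where δ is the difference-in-differences estimate. The sketch below is a minimal illustration, not a full answer: the function name `did_estimate` and the tiny noise-free panel are hypothetical, and it leaves out the unit and day fixed effects, the clustered standard errors, and the volatility controls that the question asks for. It also runs one placebo check by pretending the release happened inside the pre-window, where the estimate should be close to zero if pre-trends are parallel.

```python
import numpy as np

def did_estimate(y, treated, post):
    """Return the OLS coefficient on treated*post (the 2x2 DiD estimate)."""
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=float)
    post = np.asarray(post, dtype=float)
    # Design matrix: intercept, group dummy, period dummy, interaction.
    X = np.column_stack([np.ones_like(y), treated, post, treated * post])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[3]

# Hypothetical noise-free panel: 4 units (2 treated, 2 control) x 6 days,
# with the release taking effect from day index 3 onward.
days = np.tile(np.arange(6), 4)
treated = np.repeat([1, 1, 0, 0], 6)
post = (days >= 3).astype(int)
true_effect = -0.10  # assumed effect, for illustration only
y = 1.0 + 0.2 * treated - 0.05 * post + true_effect * treated * post

# Main estimate: should recover the assumed effect on noise-free data.
effect = did_estimate(y, treated, post)

# Placebo date: keep only pre-release rows and pretend the release
# happened at day index 1. With parallel pre-trends this is ~0.
pre = post == 0
placebo = did_estimate(y[pre], treated[pre], (days[pre] >= 1).astype(int))

print(f"DiD estimate: {effect:.4f}")
print(f"Placebo estimate (pre-window only): {placebo:.4f}")
```

On real data the point estimate alone is not enough: standard errors would need to be clustered at the level of treatment assignment (user cohort, symbol, or geo), and a placebo estimate far from zero would be a reason to move to an event-study or synthetic-control design.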
Quick Answer: This question evaluates a data scientist's competence in causal inference (difference-in-differences), time-series change-point detection (CUSUM/Bai–Perron and Bayesian Structural Time Series), statistical power/sample-size calculation, and robustness testing for attributing a metric decline to a product release.
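For the power calculation in part 3, a hedged worked example follows. It uses the normal approximation to the pooled-variance two-sample t-test, n per group = 2·(z_{1-α/2} + z_{1-β})²·σ² / δ², and the matching minimum detectable effect, MDE = (z_{1-α/2} + z_{1-β})·σ·sqrt(2/n). Two things here are assumptions rather than givens: per-user observations are treated as independent, and the one-day comparison assumes 200,000 users in each group.

```python
from math import ceil, sqrt
from statistics import NormalDist

alpha, power = 0.05, 0.80
baseline_mean, sd = 1.0, 1.5
target_delta = 0.10 * baseline_mean  # a 10% decrease, in trades per user

# Critical values for a two-sided test (normal approximation to the t-test).
z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
z_beta = NormalDist().inv_cdf(power)

# Users needed per group to detect target_delta with the stated power.
n_per_group = ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / target_delta ** 2)

# Assumption: one day of data with 200,000 independent users in each group.
users_per_group = 200_000
mde_one_day = (z_alpha + z_beta) * sd * sqrt(2 / users_per_group)

print(f"Users per group for a 10% drop: {n_per_group}")
print(f"One-day MDE: {mde_one_day:.4f} trades ({mde_one_day / baseline_mean:.2%} of baseline)")
```

Under these assumptions a 10% drop needs roughly 3,500 users per group, so a single day is nominally enough. In practice, market-wide shocks hit every user on the same day, so observations are correlated and the effective sample size is closer to the number of days or clusters than to the number of users. The window length should therefore be set from the variance of the daily series, not from the per-user SD alone.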