Analyze results and large p-values correctly
Company: Uber
Role: Data Scientist
Category: Statistics & Math
Difficulty: hard
Interview Round: Technical Screen
After the experiment ends, show exactly how you will analyze it. Compute the intent-to-treat (ITT) lift at the user assignment level with cluster-robust standard errors; use CUPED or other pre-period covariates to reduce variance; and handle ratio metrics (delta method or Fieller) and skewed outcomes correctly. Explain why session-level analysis is problematic here (repeated measures per user, session counts correlated with treatment, non-independent observations) and how to fix it (aggregate to the user level, fit mixed models, or use cluster-robust standard errors). Handle non-compliance and partial exposure (users who never opened) and estimate the treatment-on-the-treated (TOT) effect via 2SLS, using assignment as the instrument. If the p-value is large, distinguish failing to reject the null from claiming no effect, then decide whether to run a post-hoc power/MDE check and/or equivalence or non-inferiority tests (TOST); optionally compare against a Bayesian posterior with a ROPE. Outline your heterogeneity analysis and multiple-testing control.
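As a rough illustration of the user-level part of this question, the sketch below aggregates session rows to one row per user, applies a CUPED adjustment with a pre-period covariate, and computes an ITT difference in means. The function names, the toy data, and the use of a Welch-type standard error with a normal approximation are all assumptions made for illustration; with one row per user, that standard error plays roughly the role that clustering by user would play in a session-level regression.

```python
import math
from collections import defaultdict
from statistics import NormalDist, fmean, variance


def aggregate_to_user(session_rows):
    """Sum session-level outcomes so that each user contributes one observation."""
    totals = defaultdict(float)
    for user_id, value in session_rows:
        totals[user_id] += value
    return dict(totals)


def cuped_adjust(y, x):
    """CUPED: subtract theta * (x - mean(x)), with theta = cov(x, y) / var(x).

    y and x should be pooled across both arms so one theta is used for everyone.
    """
    mean_x, mean_y = fmean(x), fmean(y)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)
    theta = cov_xy / variance(x)
    return [b - theta * (a - mean_x) for a, b in zip(x, y)]


def itt_diff(treat, control):
    """Difference in user-level means, Welch standard error, two-sided z-test p-value."""
    diff = fmean(treat) - fmean(control)
    se = math.sqrt(variance(treat) / len(treat) + variance(control) / len(control))
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return diff, se, p_value


# Toy (made-up) session rows: (user_id, outcome in that session).
sessions = [("u1", 2.0), ("u1", 3.0), ("u2", 4.0), ("u3", 1.0), ("u3", 1.0), ("u4", 6.0)]
per_user = aggregate_to_user(sessions)

# Toy pre-period covariate per user, in the same order as the outcomes.
y = [per_user[u] for u in ("u1", "u2", "u3", "u4")]
x_pre = [4.0, 5.0, 1.0, 6.0]
y_cuped = cuped_adjust(y, x_pre)
```

For a true ITT estimate, every assigned user belongs in the comparison, so users with no sessions would need to be added with an outcome of zero before calling `itt_diff`. The CUPED step leaves the overall mean unchanged and only removes variance explained by the pre-period covariate.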
Quick Answer: This question evaluates a candidate's mastery of experimental analysis and applied causal inference, touching on intent-to-treat estimation, cluster-robust inference, variance reduction, handling of ratio metrics and skewed outcomes, non-compliance and instrumental-variable approaches, decision frameworks for large p-values, and heterogeneity analysis with multiple-testing control. Commonly asked in the Statistics & Math domain to assess practical application alongside conceptual understanding, it measures the ability to reason about the appropriate level of analysis, the validity of inference, and the interpretation of ambiguous results rather than just computational skill.
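The large p-value decision can be made more concrete with a small sketch. Given an estimated lift and its standard error, the code below computes the minimum detectable effect (MDE) implied by that standard error and runs a TOST equivalence test against a chosen margin. The function names, the normal approximation, and the default `alpha` and `power` values are illustrative assumptions; the equivalence margin has to come from the business context.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def mde(se, alpha=0.05, power=0.8):
    """Smallest true effect a two-sided z-test would detect with the given power."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_power) * se


def tost(diff, se, margin, alpha=0.05):
    """Two one-sided z-tests for equivalence within (-margin, +margin).

    Returns the TOST p-value (the larger of the two one-sided p-values)
    and whether equivalence is concluded at level alpha.
    """
    p_lower = 1 - _STD_NORMAL.cdf((diff + margin) / se)  # H0: true diff <= -margin
    p_upper = _STD_NORMAL.cdf((diff - margin) / se)      # H0: true diff >= +margin
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha
```

If the MDE is larger than the effect the team cares about, a non-significant result says little, because the test was underpowered for that effect. If TOST rejects both one-sided nulls, the data support the stronger claim that any effect lies inside the margin, which is different from merely failing to reject zero.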