Design an experiment for delay drivers
Company: Capital One
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Onsite
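Observational modeling of flight delays shows high variance inflation factors (VIFs) among operational drivers (e.g., gate turnaround, taxi-out time, airport busyness), so the drivers are strongly collinear and their coefficients cannot be given a credible causal interpretation.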
Design an empirical strategy to estimate the causal effect of reducing gate turnaround time on P(delay>15min).
Tasks:
1) Randomized design: Propose a cluster-randomized experiment (by airport×day block). Specify: unit of randomization, blocking/stratification, primary outcome, guardrails, and how you’ll mitigate interference/spillovers.
2) Powering: With baseline delay rate 24%, MDE = 2 pp, alpha = 0.05, power = 0.8, an average of 200 flights per cluster, and ICC = 0.15, compute the design effect and required clusters per arm (show formulas; an approximate numeric answer is acceptable).
3) If randomization is infeasible, outline a credible quasi-experiment (DiD with staggered rollout or IV). State identifying assumptions, pre-trend checks, and robustness tests.
4) Analysis plan: Pre-register estimands (ATE, CATE by airport tier), adjustments for multiplicity, and a missing-data plan. Include a plan for heterogeneous effects and operationalization into policy.
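For the powering task, the arithmetic can be sketched as follows. This is a minimal illustration, not a prescribed solution. It assumes a two-sided test, equal cluster sizes, the standard design effect DEFF = 1 + (m - 1) * ICC, and the unpooled two-proportion sample-size formula n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2. The function and variable names are hypothetical.

```python
import math
from statistics import NormalDist


def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (cluster_size - 1.0) * icc


def n_per_arm_individual(p_control, p_treat, alpha=0.05, power=0.80):
    """Flights per arm under individual randomization (unpooled variance)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    variance_sum = p_control * (1.0 - p_control) + p_treat * (1.0 - p_treat)
    return (z_alpha + z_power) ** 2 * variance_sum / (p_control - p_treat) ** 2


# Inputs taken from the question.
baseline = 0.24            # control delay rate
mde = 0.02                 # 2 pp absolute reduction
flights_per_cluster = 200
icc = 0.15

deff = design_effect(flights_per_cluster, icc)
n_ind = n_per_arm_individual(baseline, baseline - mde)
n_clustered = n_ind * deff  # flights per arm after inflating for clustering
clusters_per_arm = math.ceil(n_clustered / flights_per_cluster)

print(f"design effect: {deff:.2f}")
print(f"flights per arm (individual randomization): {n_ind:.0f}")
print(f"flights per arm (clustered): {n_clustered:.0f}")
print(f"clusters per arm: {clusters_per_arm}")
```

Under these assumptions the design effect is 30.85 and the requirement is roughly 1,072 airport×day clusters per arm. A pooled-variance formula, unequal cluster sizes, or covariate adjustment would shift the number somewhat.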
Quick Answer: This question evaluates experimental design and causal inference skills in an operational analytics context: randomized and quasi-experimental identification, power and sample-size reasoning, handling of interference and spillovers, and pre-registered estimands.