Evaluate a new product with experimentation
Company: Stripe
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
A new recommendation module may cause cross‑user interference and traffic seasonality. Design an evaluation plan. (1) Define an Overall Evaluation Criterion (OEC) for a commerce app and 3 guardrails (e.g., churn, latency p95, complaint rate) with precise formulas and units. (2) Choose a test design (user‑level RCT, geo‑cluster, or time‑based switchback) and justify against interference, non‑stationarity, and operational constraints. (3) Describe ramp strategy and pre‑registration: stopping rules, power target, variance reduction (CUPED/covariate adjustment), and small‑area risk controls. (4) If randomization is infeasible, propose a quasi‑experimental fallback (synthetic control or difference‑in‑differences) and list the assumptions and falsification tests you will run. (5) Suppose mid‑test the OEC flatlines while add‑to‑cart rises and conversion falls; provide a metric‑debugging checklist and the exact cuts you will request to localize the issue (e.g., by device, geography, new vs returning, latency buckets). Be specific and write the equations where relevant.
Quick Answer: This question evaluates experimental design and causal inference skills, including defining an Overall Evaluation Criterion (OEC) and guardrail metrics, selecting an appropriate test design and ramp/power strategy, specifying quasi-experimental fallbacks, and performing metric diagnostics for a recommendation module in a commerce app.