Design and evaluate a dasher bike rollout
Company: TikTok
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
DoorDash (DD) plans to relaunch an opt-in program letting existing car dashers also sign up to use their own bicycles/e-bikes for deliveries while retaining access to car mode. You must decide whether to scale this program next quarter. Answer precisely:
1) State the business goal and a falsifiable primary hypothesis (e.g., reduce median delivery ETA in dense zones without reducing dasher earnings/hour). List at least two secondary goals (e.g., cost per order, supply density).
2) Define one primary success metric, at least three guardrails, and two driver metrics. For each, specify exact definitions (numerator/denominator), attribution window, unit of analysis, exclusions (e.g., batched orders, outliers), and directionality.
3) Choose a randomization unit (dasher-level, zone-day-level, or order-level) and justify it by discussing noncompliance (opt-in, vehicle switching), spillovers/interference (supply rebalancing, congestion), inventory constraints (e.g., only 600 bikes for 1,200 interested dashers across 5 cities), and operational feasibility. If opt-in is required, design an encouragement RCT (invite vs no-invite) and explain ITT vs TOT; specify how you’ll instrument actual bike usage.
4) Outline your sample-size/power plan: baseline assumptions, target MDE for the primary metric, clustering/ICC impact, seasonality/day-of-week controls, and how you’ll handle unequal cluster sizes. No detailed math is needed, but be concrete about inputs you’d request.
5) Pre-rollout plan: instrumentation to infer actual vehicle per trip, data QA checks, eligibility rules, safety/compliance, and a canary+ramp schedule with explicit stop/go thresholds.
6) Analysis plan: handle heterogeneous effects (hills/elevation, weather, time-of-day, restaurant wait), contamination/crossovers, and selection bias from opt-in. Describe how you’d separate speed-of-travel effects from restaurant latency.
7) Decision framework: specify success criteria that trigger scale-up vs rollback and how you’d monitor post-launch with long-term holdouts.
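For the power plan in item 4, the inputs can be combined as in the rough sketch below. It assumes a two-arm comparison of means with a normal approximation, and it inflates the sample size by a commonly used design-effect approximation for clustered designs with unequal cluster sizes, 1 + ((cv^2 + 1) * m - 1) * ICC, where m is the mean cluster size and cv is the coefficient of variation of cluster sizes. The function names and every numeric input are hypothetical placeholders, not DoorDash figures.

```python
import math
from statistics import NormalDist


def design_effect(mean_cluster_size, icc, cv=0.0):
    """Variance inflation from clustering; cv > 0 penalizes unequal cluster sizes."""
    return 1.0 + ((cv ** 2 + 1.0) * mean_cluster_size - 1.0) * icc


def n_per_arm(sigma, mde, alpha=0.05, power=0.8):
    """Per-arm sample size for a two-sided difference in means, ignoring clustering."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / mde ** 2


def clusters_per_arm(sigma, mde, mean_cluster_size, icc, cv=0.0,
                     alpha=0.05, power=0.8):
    """Clusters (e.g., zone-days) needed per arm after applying the design effect."""
    n = n_per_arm(sigma, mde, alpha, power)
    n_clustered = n * design_effect(mean_cluster_size, icc, cv)
    return math.ceil(n_clustered / mean_cluster_size)


if __name__ == "__main__":
    # Hypothetical inputs: delivery-time SD of 10 minutes, MDE of 1 minute,
    # 50 orders per zone-day on average, ICC of 0.05, cluster-size CV of 0.6.
    print(clusters_per_arm(sigma=10, mde=1, mean_cluster_size=50, icc=0.05, cv=0.6))
```

The inputs to request are therefore the outcome's standard deviation, the target MDE, the mean and spread of cluster sizes, and an ICC estimate from historical data. Seasonality and day-of-week controls are not modeled here; they would typically enter through stratification or covariate adjustment.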
Quick Answer: This question evaluates experimental design, program evaluation, causal inference, metric specification, and operational analytics competencies for marketplace platforms. It belongs to the Analytics & Experimentation domain.
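As an illustration of the causal-inference piece, the ITT vs TOT distinction in an invite vs no-invite encouragement design can be sketched as follows. The sketch assumes one record per dasher with hypothetical fields `invited`, `used_bike`, and `outcome`, and it uses the Wald (instrumental-variable) ratio: the ITT effect divided by the difference in bike take-up between arms. That ratio is a complier average effect; it coincides with TOT only when uninvited dashers cannot use bikes (one-sided noncompliance). The data are made up for illustration.

```python
from statistics import mean


def itt_and_tot(records):
    """Return (ITT, take-up difference, Wald estimate) from per-dasher records."""
    invited = [r for r in records if r["invited"] == 1]
    control = [r for r in records if r["invited"] == 0]

    # ITT: compare outcomes by assignment, regardless of actual bike usage.
    itt = mean(r["outcome"] for r in invited) - mean(r["outcome"] for r in control)

    # First stage: how much the invite shifts actual bike usage.
    take_up = mean(r["used_bike"] for r in invited) - mean(r["used_bike"] for r in control)
    if take_up == 0:
        raise ValueError("Invite did not change bike usage; Wald ratio is undefined.")

    return itt, take_up, itt / take_up


if __name__ == "__main__":
    # Made-up data: outcome is mean delivery time in minutes per dasher.
    records = [
        {"invited": 1, "used_bike": 1, "outcome": 28.0},
        {"invited": 1, "used_bike": 1, "outcome": 28.0},
        {"invited": 1, "used_bike": 0, "outcome": 30.0},
        {"invited": 1, "used_bike": 0, "outcome": 30.0},
        {"invited": 0, "used_bike": 0, "outcome": 30.0},
        {"invited": 0, "used_bike": 0, "outcome": 30.0},
        {"invited": 0, "used_bike": 0, "outcome": 30.0},
        {"invited": 0, "used_bike": 0, "outcome": 30.0},
    ]
    print(itt_and_tot(records))
```

Comparing bike users against non-users directly would mix in opt-in selection; the ratio above relies only on the randomized invite, which is why actual bike usage per dasher has to be instrumented reliably.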