Design metrics and A/B test for maps and ETA
Company: Uber
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: easy
Interview Round: Technical Screen
## Context
You work on Uber’s driver app. Drivers can navigate using either **Google Maps** or **Uber Maps**. Separately, Uber shows riders an **estimated time of arrival (ETA)** and you are considering changing the ETA model such that the displayed ETA becomes **shorter** on average.
## Part A — Map preference & map quality metrics
Design a **metrics framework** to evaluate:
1) Whether drivers **prefer** Google Maps or Uber Maps.
2) The **pros/cons** of Uber Maps vs Google Maps from both driver and marketplace perspectives.
Your answer should include:
- A clear definition of “preference” (behavioral vs stated).
- Primary metrics + diagnostic metrics + guardrails.
- Key sources of bias/confounding (e.g., self-selection into a map provider).
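A behavioral definition of preference can be made concrete as "revealed preference": the map provider a driver actually uses most often. A minimal sketch, assuming a hypothetical event log with `driver_id` and `provider` fields (the schema and provider labels here are illustrative, not Uber's actual telemetry):

```python
from collections import Counter

# Hypothetical navigation log: one record per completed navigation session.
sessions = [
    {"driver_id": "d1", "provider": "uber_maps"},
    {"driver_id": "d1", "provider": "uber_maps"},
    {"driver_id": "d1", "provider": "google_maps"},
    {"driver_id": "d2", "provider": "google_maps"},
]

def revealed_preference(sessions):
    """Return each driver's majority map provider as a behavioral preference proxy."""
    by_driver = {}
    for s in sessions:
        by_driver.setdefault(s["driver_id"], Counter())[s["provider"]] += 1
    return {d: counts.most_common(1)[0][0] for d, counts in by_driver.items()}

prefs = revealed_preference(sessions)
# d1 used Uber Maps on 2 of 3 sessions; d2 used Google Maps exclusively
```

Note that this proxy is exactly where self-selection bias enters: drivers who choose a provider may differ systematically (tenure, market, device), so the share itself is descriptive, not causal.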
## Part B — If displayed rider ETA becomes shorter
1) What **marketplace and user behaviors** might change if the displayed ETA is systematically shorter?
2) Propose an **A/B test design** to measure the impact.
Your experiment plan should specify:
- Unit of randomization (rider, driver, trip, market).
- Primary success metrics, diagnostics, and guardrails.
- How you’d handle interference/network effects, seasonality, and heterogeneous treatment effects.
- Launch and monitoring plan (ramp, stop criteria).
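Before any ramp, the plan should state how long the test must run to detect the target effect. A minimal sample-size sketch using the standard two-sample normal approximation (the effect size and standard deviation below are placeholder values, not real Uber numbers):

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(effect, sd, alpha=0.05, power=0.8):
    """Per-arm sample size for a two-sided, two-sample z-test on means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)

# e.g., detect a 0.1-minute shift in mean rider wait time (assumed sd = 1 minute)
n = sample_size_per_arm(effect=0.1, sd=1.0)  # → 1570 trips per arm
```

If randomization is at the market level to handle interference, the unit count refers to market-weeks rather than trips, and the required sample should be inflated for between-market variance (a design-effect adjustment), which this simple formula does not capture.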
## Quick Answer
This question evaluates proficiency in metrics design, causal inference, and experimentation for product and marketplace features. Specifically, it tests whether the candidate can define behavioral versus stated preference, select primary, diagnostic, and guardrail metrics, and identify sources of bias and confounding; it falls under the Analytics & Experimentation domain for Data Science roles. It is commonly asked because interviewers need assurance that the candidate can reason about marketplace dynamics, interference, and seasonality, and can plan practical experiment elements such as the unit of randomization, diagnostics, monitoring, and ramp/stop criteria. It therefore probes both conceptual understanding (biases, notions of preference) and practical application (metric selection and experiment design).