Scenario
You are in a product-planning session and must define success criteria before development begins for a new change to the core booking funnel in a two‑sided marketplace app (guests booking stays from hosts).
Assume the feature materially affects the guest search/booking experience and may indirectly affect host outcomes (e.g., listing exposure, acceptance, cancellations). The goal is to plan how you would measure success and safely experiment.
Task
Design the success metrics, guardrails, and an experiment plan that you would propose to stakeholders.
Requirements
-
Success Metrics
-
Define a single primary north-star metric and supporting diagnostic metrics.
-
Explain why each metric is chosen and how it is computed (unit and window).
-
Guardrails
-
List key guardrail metrics and suggested non‑degradation thresholds.
-
Experiment Design
-
Choose a unit of randomization and justify it. Address interference concerns in a two‑sided marketplace and propose alternatives if needed.
-
Define eligibility, bucketing, and any stratification/blocking.
-
Sample Size and Power Analysis
-
State statistical assumptions (α, power) and baseline values.
-
Show how you would compute minimum sample size (or runtime) for the primary metric, including a small numeric example.
-
Note any variance reduction or clustering adjustments.
-
Runtime Monitoring and Data Quality
-
Specify what you would monitor daily/in real time, including SRM checks.
-
Describe ramp strategy, early stopping for harm, and logging/data-quality checks.
-
Analysis and Decision Criteria
-
Outline the analysis method, heterogeneity checks, and ship/rollback rules.
Hints
-
Differentiate clearly between north-star vs. diagnostic metrics.
-
Include power analysis details and assumptions.
-
Call out concrete data-quality checks you would run.