A food-delivery company currently serves homepage store recommendations with ranking model V1.1. A new model V2.0 adds several new features and may require a different feature configuration for treatment users.
Design an experimentation and rollout plan for this model upgrade.
Your answer should address:
- How to define the primary success metric and important guardrail metrics for a homepage recommendation model in a two-sided delivery marketplace.
- How to choose the unit of randomization (for example, user-level, session-level, geo-level, or switchback/time-based) given that recommendations can affect merchant demand, delivery times, and marketplace balance.
- How the serving infrastructure should support experiment-specific model versions and feature-set configuration, so control and treatment groups can fetch different feature lists safely.
- What events and metadata must be logged so the experiment can be analyzed correctly.
- How to handle practical issues such as sample ratio mismatch, delayed conversions, feature missingness, novelty effects, selection bias, and spillover/interference.
- How to estimate power / MDE, and when methods such as stratification or CUPED would help.
- What criteria you would use for ramping, rollback, and final launch decisions.
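
For the sample ratio mismatch and power / MDE items, an answer can be grounded in a quick calculation. The following is a minimal sketch, not a prescribed method: it assumes user-level randomization with independent units, two equal-sized arms, and a normal approximation for the metric. The function names (`srm_p_value`, `minimum_detectable_effect`) and the `cuped_correlation` parameter are hypothetical, introduced only for illustration. Under geo-level or switchback designs the effective sample size is the number of clusters or time windows, so these formulas would overstate power.

```python
import math
from statistics import NormalDist


def srm_p_value(n_control: int, n_treatment: int,
                expected_treatment_share: float = 0.5) -> float:
    """Chi-square goodness-of-fit test (1 degree of freedom) for sample ratio mismatch.

    A very small p-value suggests the observed split differs from the
    configured split, so assignment or logging should be checked before
    reading any metric results.
    """
    total = n_control + n_treatment
    expected_t = total * expected_treatment_share
    expected_c = total - expected_t
    chi2 = ((n_treatment - expected_t) ** 2 / expected_t
            + (n_control - expected_c) ** 2 / expected_c)
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def minimum_detectable_effect(std_dev: float, n_per_arm: int,
                              alpha: float = 0.05, power: float = 0.8,
                              cuped_correlation: float = 0.0) -> float:
    """Absolute MDE for a two-sided, two-sample test of means.

    cuped_correlation is the assumed correlation between the metric and its
    pre-experiment covariate; CUPED scales the variance by (1 - rho^2).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    adjusted_variance = std_dev ** 2 * (1 - cuped_correlation ** 2)
    # Standard error of the difference in means with equal arm sizes.
    standard_error = math.sqrt(2 * adjusted_variance / n_per_arm)
    return (z_alpha + z_power) * standard_error


if __name__ == "__main__":
    # Illustrative inputs only; these are not measurements from any experiment.
    print(srm_p_value(50_000, 51_000))
    print(minimum_detectable_effect(std_dev=1.0, n_per_arm=10_000))
    print(minimum_detectable_effect(std_dev=1.0, n_per_arm=10_000,
                                    cuped_correlation=0.6))
```

In this sketch, a pre-period covariate with correlation 0.6 shrinks the MDE by a factor of sqrt(1 - 0.36) = 0.8, which is the kind of gain that makes CUPED or stratification worth discussing when the raw MDE is larger than the effect the new model can plausibly deliver.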