Evaluate a New Ads Recommendation Model via Online Experimentation
Scenario
You have trained a new ad-recommendation model and must decide whether it should replace the incumbent model that currently ranks/serves ads in a large-scale, auction-based ads system.
Task
Design an experiment to evaluate the new model against the incumbent and answer:
- Experiment design: unit of randomization, traffic split/ramp, duration, and how to handle auction/marketplace interference.
- Metrics: define primary success metric(s) and guardrails (revenue, CTR, ROI, user retention, latency, etc.).
- Stakeholder readouts: how the story and metrics differ for a CFO vs a CGO (growth).
- Decision framework: launch or rollback criteria, with monitoring and risk mitigation.
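One way to make the launch or rollback criteria concrete is to write them down as an explicit rule before the experiment starts. The sketch below is only an illustration under stated assumptions: the `MetricResult` type, the `decide` function, the metric names, and the 1% guardrail tolerance are all hypothetical, and each metric is assumed to be summarized as a confidence interval on relative lift (treatment vs. control).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MetricResult:
    """Hypothetical summary of one metric: CI on relative lift vs. control."""
    name: str
    ci_low: float
    ci_high: float
    higher_is_better: bool = True


def _oriented(m: MetricResult) -> Tuple[float, float]:
    # Flip the interval for lower-is-better metrics (e.g. latency),
    # so that a positive value always means "treatment is better".
    if m.higher_is_better:
        return m.ci_low, m.ci_high
    return -m.ci_high, -m.ci_low


def decide(primary: MetricResult,
           guardrails: List[MetricResult],
           guardrail_tolerance: float = 0.01) -> str:
    """Return 'launch', 'rollback', or 'extend' (assumed rule, not a standard)."""
    p_low, p_high = _oriented(primary)

    # Rollback: the primary metric is significantly worse than control.
    if p_high < 0:
        return "rollback"

    all_cleared = True
    for g in guardrails:
        g_low, g_high = _oriented(g)
        # Rollback: the whole CI sits below the tolerated degradation.
        if g_high < -guardrail_tolerance:
            return "rollback"
        # Cleared (non-inferior) only if the CI excludes the tolerated loss.
        if g_low < -guardrail_tolerance:
            all_cleared = False

    # Launch: significant primary win and every guardrail is non-inferior.
    if p_low > 0 and all_cleared:
        return "launch"
    # Otherwise the evidence is inconclusive: keep running or ramp slowly.
    return "extend"


if __name__ == "__main__":
    primary = MetricResult("revenue_per_user", 0.004, 0.021)
    guardrails = [
        MetricResult("ctr", -0.003, 0.008),
        MetricResult("p99_latency_ms", -0.002, 0.006, higher_is_better=False),
    ]
    print(decide(primary, guardrails))
```

Fixing the rule in advance keeps the finance and growth readouts anchored to the same decision, even when each audience emphasizes different metrics.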
Requirements
- Cover A/B design, traffic allocation, expected duration, statistical significance/power, and trade-offs between finance and growth.
- Include clear metric definitions and assumptions where needed.
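For the duration and power requirement, a rough sample-size estimate is a useful starting assumption. The sketch below uses the standard normal-approximation formula for a two-sided, two-sample comparison of means with equal arms, n per arm = 2 (z_{1-alpha/2} + z_{power})^2 sigma^2 / delta^2. The function names and the example inputs (baseline revenue per user, standard deviation, daily traffic, 5% of traffic per arm) are hypothetical. It assumes user-level randomization and independent users, and it ignores both marketplace interference and the fact that unique users accumulate more slowly than daily traffic suggests.

```python
import math
from statistics import NormalDist


def users_per_arm(baseline_mean: float, std_dev: float, relative_mde: float,
                  alpha: float = 0.05, power: float = 0.8) -> int:
    """Users needed in each arm to detect a relative lift of `relative_mde`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    delta = baseline_mean * relative_mde           # absolute effect to detect
    return math.ceil(2 * (z_alpha + z_power) ** 2 * std_dev ** 2 / delta ** 2)


def days_needed(n_per_arm: int, daily_eligible_users: int,
                share_per_arm: float) -> int:
    """Days to fill one arm, rounded up to whole weeks.

    Assumes every day brings `daily_eligible_users` new unique users,
    which is optimistic when many users return on multiple days.
    """
    raw_days = math.ceil(n_per_arm / (daily_eligible_users * share_per_arm))
    # Round up to full weeks so each weekday is represented equally.
    return math.ceil(raw_days / 7) * 7


if __name__ == "__main__":
    # Hypothetical inputs: mean revenue/user 1.0, std dev 5.0, 2% relative MDE.
    n = users_per_arm(baseline_mean=1.0, std_dev=5.0, relative_mde=0.02)
    print(n, days_needed(n, daily_eligible_users=2_000_000, share_per_arm=0.05))
```

Because the required sample grows with the inverse square of the detectable effect, halving the minimum detectable effect roughly quadruples the users needed, which is the main lever behind the trade-off between ramp size and duration.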