# Design an experiment to evaluate a new ads algorithm
Company: Meta
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: Hard
Interview Round: Technical Screen
You are a Product Analytics/Data Science partner for an ads ranking/recommendation team. Facebook has shipped (or plans to ship) a **new ad recommendation algorithm** and believes it performs better than the current one.
## Part A: How would you evaluate whether it is better?
Design an evaluation plan assuming you *can* run an experiment. In your answer:
- Define the **goal** and key stakeholders (users, advertisers, and the platform's revenue).
- Propose a set of **metrics** with trade-offs:
- **Primary metric** (choose one and justify)
- **Diagnostic metrics** to explain movement
- **Guardrail metrics** to prevent harm (user experience, platform integrity, advertiser outcomes)
- Choose the **unit of randomization** (e.g., user, session, advertiser) and explain interference risks (marketplace effects, repeated exposure, learning effects).
- Discuss the basics of experiment execution: ramp plan, duration, power/MDE considerations, data-quality checks such as sample ratio mismatch (SRM), and how you would interpret heterogeneous impacts (a power/SRM sketch follows this list).
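For the power/MDE and SRM points above, a minimal Python sketch; the 2% baseline CTR, 1% relative MDE, and the observed arm counts are illustrative assumptions, not real figures:

```python
# Illustrative power/MDE and SRM calculations; all inputs are hypothetical.
import math
from scipy import stats

def required_n_per_arm(p_baseline, mde_rel, alpha=0.05, power=0.8):
    """Approximate per-arm sample size for a two-proportion z-test."""
    p1 = p_baseline
    p2 = p_baseline * (1 + mde_rel)          # treatment rate at the MDE
    p_bar = (p1 + p2) / 2
    z_alpha = stats.norm.ppf(1 - alpha / 2)  # two-sided test
    z_beta = stats.norm.ppf(power)
    n = ((z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
          + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
         / (p2 - p1) ** 2)
    return math.ceil(n)

def srm_detected(n_control, n_treatment, expected_ratio=0.5, alpha=0.001):
    """Chi-square test for sample ratio mismatch between the two arms."""
    total = n_control + n_treatment
    expected = [total * expected_ratio, total * (1 - expected_ratio)]
    _, p_value = stats.chisquare([n_control, n_treatment], f_exp=expected)
    return p_value < alpha  # True => halt analysis and debug assignment

# Detecting a 1% relative lift on a 2% baseline CTR needs a large sample:
print(required_n_per_arm(0.02, 0.01))
# Observed counts of 1,000,000 vs 1,010,000 under a nominal 50/50 split
# fail the SRM check:
print(srm_detected(1_000_000, 1_010_000))
```

Tight MDEs at ads scale are one reason long durations and variance-reduction techniques (see Part B) matter.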
## Part B: What if 50/50 randomization is not feasible?
Sometimes a 50/50 split is not appropriate (e.g., risk, capacity limits, model learning/feedback loops, or advertiser delivery constraints). Propose **at least two** valid alternatives for randomization or rollout (e.g., unequal allocation, phased ramp, cluster randomization, switchback, geo split), and explain:
- When each approach is appropriate
- Key pitfalls (bias, interference, novelty effects)
- How the analysis changes (e.g., variance estimation, sequential monitoring, CUPED; a CUPED sketch follows this list)
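On the CUPED point, a minimal sketch assuming a pre-experiment covariate is available for each unit (e.g., the same metric measured before the test); the simulated data is purely illustrative:

```python
# Minimal CUPED adjustment: subtract the part of the in-experiment metric
# that is predictable from a pre-experiment covariate. Because the covariate
# predates assignment, this shrinks variance without biasing the effect.
import numpy as np

def cuped_adjust(y, x):
    """Return y - theta * (x - mean(x)), with theta = Cov(y, x) / Var(x)."""
    theta = np.cov(y, x)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean())

rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, 10_000)                # pre-period metric
y = 0.8 * x + rng.normal(0.0, 1.0, 10_000)       # in-experiment metric
y_adj = cuped_adjust(y, x)
print(np.var(y, ddof=1), np.var(y_adj, ddof=1))  # adjusted variance drops
```

The usual mean-difference estimator is then applied to the adjusted metric, so the same power is reached with smaller samples or shorter durations.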
## Part C: If you cannot run a controlled test
If you are **not allowed** to run an online A/B test, how would you decide what to recommend to users and/or whether the new algorithm is better?
- Propose an approach using observational/offline data.
- Address confounding and selection bias (a propensity-weighting sketch follows this list).
- For recommendations (especially cold-start users), describe a reasonable strategy that balances relevance and exploration (see the bandit-style sketch further below).
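For the confounding bullet, one standard offline approach is inverse-propensity weighting (IPW) over logged exposure data. A hedged sketch follows; the column names and confounder list are hypothetical placeholders, and the estimate is only credible if exposure is well explained by observed covariates:

```python
# Hypothetical IPW estimate of the average treatment effect from logs.
# Column names ("saw_new_algo", "conversion", confounders) are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_ate(df, treatment_col, outcome_col, confounder_cols, clip=0.01):
    """Reweight exposed/unexposed units by inverse propensity scores."""
    X = df[confounder_cols].to_numpy()
    t = df[treatment_col].to_numpy()
    y = df[outcome_col].to_numpy()
    # Model P(exposure | observed confounders).
    e = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    e = np.clip(e, clip, 1 - clip)  # trim extreme weights for stability
    return np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))

# Usage with a hypothetical log frame:
# ate = ipw_ate(logs, "saw_new_algo", "conversion",
#               ["age", "past_ctr", "tenure_days"])
```

A failure mode worth stating explicitly: unobserved confounding (e.g., the new algorithm reached power users first) invalidates the weights, so a sensitivity analysis or a negative-control outcome helps bound the bias.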
Be explicit about assumptions and failure modes.
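For the cold-start point, one simple relevance-plus-exploration policy is an epsilon-greedy slate built on an offline relevance model's scores. A minimal sketch, with illustrative scores and epsilon:

```python
# Epsilon-greedy slate construction: mostly exploit the relevance model's
# top candidates, occasionally explore so new ads/users gather signal.
# The scores and epsilon below are illustrative assumptions.
import random

def build_slate(candidate_scores, epsilon=0.1, k=5):
    """candidate_scores: {ad_id: relevance_score} from an offline model."""
    ranked = sorted(candidate_scores, key=candidate_scores.get, reverse=True)
    slate = []
    for _ in range(min(k, len(ranked))):
        remaining = [a for a in ranked if a not in slate]
        if random.random() < epsilon:
            slate.append(random.choice(remaining))  # explore uniformly
        else:
            slate.append(remaining[0])              # exploit best remaining
    return slate

print(build_slate({"ad_a": 0.9, "ad_b": 0.7, "ad_c": 0.4, "ad_d": 0.2}, k=3))
```

Thompson sampling or UCB would replace the uniform exploration with uncertainty-weighted exploration; the key assumption to surface is that logged feedback from explored slots is unbiased enough to retrain on.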
Quick Answer: This question tests a data scientist's command of experimental design, causal inference, metric selection and trade-offs, randomization and interference reasoning, rollout strategies, and observational methods for algorithm evaluation, with explicit attention to impacts on users, advertisers, and revenue.