This question evaluates a data scientist's competency in experimental design, causal inference, metric selection and trade-offs, randomization and interference reasoning, rollout strategies, and observational methods for algorithm evaluation, including the impact on stakeholders such as users and advertisers and on revenue.

You are a Product Analytics/Data Science partner for an ads ranking/recommendation team. Facebook has shipped (or plans to ship) a new ad recommendation algorithm and believes it is better.
Design an evaluation plan assuming you can run an experiment (a sample-size sketch follows the question). In your answer:
Sometimes a 50/50 split is not appropriate (e.g., risk, capacity limits, model learning/feedback loops, or advertiser delivery constraints). Propose at least two valid alternatives for randomization or rollout (e.g., unequal allocation, phased ramp, cluster randomization, switchback, geo split), and explain when each is appropriate and what trade-offs it introduces (one such design is sketched below).
If you are not allowed to run an online A/B test, how would you decide what to recommend to users and/or whether the new algorithm is better? (An off-policy evaluation sketch follows below.)
Be explicit about assumptions and failure modes.
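
For the experimental part of the plan, a minimal sample-size sketch, assuming a two-sided two-proportion z-test on a user-level rate metric such as CTR; the baseline rate, relative lift, alpha, and power below are illustrative assumptions, not Facebook values:

```python
# Minimal sample-size sketch for a two-arm test on a rate metric such as
# CTR. Baseline rate, relative lift, alpha, and power are illustrative
# assumptions, not Facebook values.
import math
from statistics import NormalDist

def n_per_arm(p_base: float, rel_lift: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Users needed per arm to detect a relative lift in a base rate
    with a two-sided two-proportion z-test."""
    p_new = p_base * (1 + rel_lift)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_b = NormalDist().inv_cdf(power)          # power quantile
    var = p_base * (1 - p_base) + p_new * (1 - p_new)
    return math.ceil((z_a + z_b) ** 2 * var / (p_new - p_base) ** 2)

# Detecting a 2% relative CTR lift off a 1.5% baseline:
print(n_per_arm(0.015, 0.02))  # roughly 2-3 million users per arm
```

Note that if you randomize by user but measure per-impression metrics, the formula above understates variance because impressions within a user are correlated; a delta-method or cluster-robust variance correction is the usual fix.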
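One way to implement an interference-aware alternative from the list above is a switchback design: randomize (geo, time-block) pairs rather than users, so auction-level interference between arms stays within a block. A sketch of deterministic block assignment, where the hourly block length, geo codes, and salt are hypothetical choices:

```python
# Illustrative switchback assignment: each (geo, hour) block is hashed
# into an arm, giving a reproducible, balanced-in-expectation schedule.
# The salt, geo codes, and hourly granularity are assumptions.
import hashlib

def arm(geo: str, hour_index: int, salt: str = "adrec_v2_switchback") -> str:
    """Deterministically hash a (geo, hour) block into an arm."""
    key = f"{salt}:{geo}:{hour_index}".encode()
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 2
    return "treatment" if bucket == 0 else "control"

for h in range(6):
    print(h, arm("US-CA", h), arm("BR-SP", h))
```

The analysis must then treat the block, not the user, as the unit: use cluster-robust standard errors and watch for carryover between adjacent blocks (e.g., budget pacing that spills across the switch boundary).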
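If no online test is allowed, one standard option is off-policy evaluation from the current ranker's logs. A sketch of a self-normalized inverse propensity scoring (SNIPS) estimator, assuming the logging policy recorded a propensity for each shown ad and has support wherever the new policy does; the arrays below are toy data:

```python
# Off-policy sketch: estimate the new ranker's value from logs of the
# current ranker via self-normalized inverse propensity scoring (SNIPS).
# Assumes logged propensities exist and overlap with the new policy;
# the arrays below are toy data, not real logs.
import numpy as np

def snips(rewards, logged_props, new_props, clip=10.0):
    """Self-normalized IPS estimate of the new policy's mean reward."""
    w = np.minimum(np.asarray(new_props) / np.asarray(logged_props), clip)
    r = np.asarray(rewards, dtype=float)
    return float((w * r).sum() / w.sum())

rewards      = [1, 0, 0, 1, 0]   # e.g., click indicators
logged_props = [0.20, 0.50, 0.10, 0.25, 0.40]
new_props    = [0.30, 0.20, 0.05, 0.50, 0.10]
print(snips(rewards, logged_props, new_props))
```

Key failure modes to name here: missing or miscalibrated logged propensities, weak overlap (clipping the weights then biases the estimate toward the logging policy), and short-run reward signals such as clicks that may not capture long-run user or advertiser value.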