This question evaluates proficiency in online experiment design, A/B testing, metric selection and interpretation, statistical power and significance, heterogeneity analysis, and translating CTR lifts into quantified business impact.

You are evaluating a newly built ML ranking model for an ads recommendation surface. The goal is to determine whether the new model should replace the current system based on rigorous online experimentation.
Design an experiment to evaluate the new recommendation model. Specify:
Login required