Design long-tail search evaluation under label budget
Company: Google
Role: Data Scientist
Category: Analytics & Experimentation
Difficulty: hard
Interview Round: Technical Screen
Quick Answer: This question evaluates a data scientist's competencies in experimental design and analytics for assessing long-tail search quality under a limited labeling budget: stratified sampling with Neyman allocation, importance-weighted (inverse-propensity) estimators, counterfactual click-based evaluation with propensity-aware methods, variance estimation and power analysis, and active-learning and drift-monitoring strategies. Commonly asked in the Analytics & Experimentation domain, it verifies the ability to build statistically rigorous, budget-constrained evaluation pipelines that combine limited human labels with logged click data, testing both conceptual understanding (sampling theory, causal identification, statistical guarantees) and practical application (variance computation, power calculations, operational monitoring).
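A candidate could sketch the core sampling step in a few lines. The snippet below is a minimal illustration of Neyman allocation of a fixed label budget across query strata, followed by the stratified mean estimate and its variance; the strata shares, pilot standard deviations, and per-stratum quality means are hypothetical placeholders, not values from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical query strata with assumed traffic shares (N_h) and pilot
# estimates of the per-stratum label standard deviation (S_h).
strata = ["head", "torso", "tail"]
N_h = np.array([0.70, 0.25, 0.05])   # traffic proportion per stratum (assumed)
S_h = np.array([0.10, 0.25, 0.45])   # pilot std. dev. of the quality label (assumed)
budget = 1000                        # total human labels available

# Neyman allocation: n_h proportional to N_h * S_h, which minimizes the
# variance of the stratified mean for a fixed total sample size.
n_h = np.round(budget * (N_h * S_h) / np.sum(N_h * S_h)).astype(int)

# Stratified estimate and its variance from simulated labels.
true_means = np.array([0.9, 0.7, 0.4])  # hypothetical per-stratum quality
est, var = 0.0, 0.0
for h, n in enumerate(n_h):
    labels = rng.normal(true_means[h], S_h[h], size=n)  # simulated labels
    est += N_h[h] * labels.mean()                       # weight by stratum share
    var += N_h[h] ** 2 * labels.var(ddof=1) / n         # stratified variance term

se = np.sqrt(var)
print(f"allocation: {dict(zip(strata, n_h))}")
print(f"stratified estimate: {est:.3f} +/- {1.96 * se:.3f} (95% CI)")
```

Note that tail queries receive far more labels than their traffic share would suggest, precisely because their label variance is highest; the same variance expression feeds directly into the power analysis the question asks about.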