Design ML ranking for query suggestions
Company: Etsy
Role: Data Scientist
Category: Machine Learning
Difficulty: hard
Interview Round: Technical Screen
Given query/click logs with fields {user_id, timestamp, locale, device, typed_prefix, suggested_term, position, clicked, dwell_time, downstream_query, eventual_success}, design an ML system to re-rank candidate suggestions for each prefix. Specify:
(1) the label(s) that best reflect long-term user success (e.g., success within session vs. click) and how to create time-respecting train/validation splits to avoid leakage;
(2) how you will correct position/selection bias (e.g., counterfactual logging, inverse propensity weighting, randomized interleaving);
(3) the feature sets (contextual, lexical, popularity time series, embeddings/LM semantics) and how to handle multilingual text and Unicode normalization;
(4) the model class, the serving constraints (latency/memory), and a fallback for cold-start terms/users;
(5) strategies to limit feedback loops, drift, and unsafe/low-quality suggestions;
(6) an offline/online evaluation plan with rollback criteria.
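As a minimal sketch of the time-respecting split asked for in part (1), the Python below cuts the log at a timestamp and drops a short window of training rows just before the cutoff. The function name `time_split`, the 30-minute `gap`, and the sample rows are illustrative assumptions, not part of the question; the idea is only that a forward-looking label such as eventual_success should not be computed from activity that falls in the validation period.

```python
from datetime import datetime, timedelta


def time_split(events, cutoff, gap=timedelta(minutes=30)):
    """Split log rows into train/validation by timestamp.

    Rows inside the `gap` window before `cutoff` are dropped from train,
    because their forward-looking labels (e.g. eventual_success) may be
    decided by activity that happens after the cutoff.
    The 30-minute default is an assumed session timeout.
    """
    train = [e for e in events if e["timestamp"] < cutoff - gap]
    valid = [e for e in events if e["timestamp"] >= cutoff]
    return train, valid


# Hypothetical rows; only the fields needed for the split are shown.
events = [
    {"timestamp": datetime(2024, 1, 1, 9, 0), "eventual_success": 1},
    {"timestamp": datetime(2024, 1, 1, 23, 50), "eventual_success": 0},
    {"timestamp": datetime(2024, 1, 2, 0, 10), "eventual_success": 1},
]
train, valid = time_split(events, cutoff=datetime(2024, 1, 2))
```

Here the 23:50 row is discarded rather than assigned to either side, since its session could extend past midnight. A stricter variant would split by whole sessions or also hold out users, depending on what kind of leakage matters most.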
Quick Answer: This question tests whether a candidate can design a production-grade machine learning system for re-ranking autocomplete suggestions. It covers label definition for long-term success, counterfactual bias correction, feature engineering (including multilingual and Unicode handling), model selection, serving constraints, and evaluation strategy. Interviewers use it to assess practical understanding of offline/online evaluation, bias mitigation, latency and memory trade-offs, and robustness to feedback loops and distributional drift.
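One possible sketch of the position-bias correction and Unicode handling mentioned above, in Python. It assumes a position-based click model in which each position has a known examination propensity (for example, estimated from a small slice of randomized traffic). The `propensity` values, the `clip` threshold, and the helper names are hypothetical. Keys are normalized with NFKC plus case folding so that visually equivalent prefixes and terms are aggregated together.

```python
import unicodedata
from collections import defaultdict


def normalize(text):
    """NFKC-normalize and case-fold so equivalent strings share one key."""
    return unicodedata.normalize("NFKC", text).casefold().strip()


def ipw_click_rate(events, propensity, clip=0.05):
    """Inverse-propensity-weighted click rate per (prefix, term).

    Each click is weighted by 1 / P(position examined); propensities are
    clipped from below to bound the variance. Estimates can exceed 1.
    """
    weighted_clicks = defaultdict(float)
    impressions = defaultdict(int)
    for e in events:
        key = (normalize(e["typed_prefix"]), normalize(e["suggested_term"]))
        p = max(propensity[e["position"]], clip)
        weighted_clicks[key] += e["clicked"] / p
        impressions[key] += 1
    return {k: weighted_clicks[k] / impressions[k] for k in impressions}


# Assumed propensities: position 2 is examined half as often as position 1.
propensity = {1: 1.0, 2: 0.5}
events = [
    {"typed_prefix": "wo", "suggested_term": "Wool Scarf", "position": 2, "clicked": 1},
    {"typed_prefix": "\uff57\uff4f", "suggested_term": "wool scarf", "position": 2, "clicked": 0},
    {"typed_prefix": "WO", "suggested_term": "wool scarf", "position": 2, "clicked": 0},
    {"typed_prefix": "wo", "suggested_term": "wool scarf", "position": 2, "clicked": 0},
]
rates = ipw_click_rate(events, propensity)
```

In this toy log the raw click-through rate is 0.25, while the weighted estimate for ("wo", "wool scarf") is 0.5, reflecting that the term was only ever shown in a position assumed to be examined half the time. The full-width prefix in the second row collapses to the same key as the ASCII one after NFKC normalization.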