Design a hashtag recommender with cold start
Company: Meta
Role: Data Scientist
Category: Machine Learning
Difficulty: hard
Interview Round: Onsite
You’re designing hashtag recommendations for a short‑video app. Answer all parts precisely:
1) Enumerate at least 12 concrete signals/features to predict which hashtags a user will follow. For each, specify: data type (binary/categorical/continuous), time horizon (short vs long‑term), and normalization/bucketing. Include: recency‑weighted views/likes/comments/saves, negative feedback, creator/user similarity, follow‑graph features, session context (time of day, device), geographic and demographic signals, hashtag global/trending velocity, and safety indicators.
2) Propose an end‑to‑end system with candidate generation + ranking. For ranking, compare a linear model, gradient‑boosted trees, and a wide‑and‑deep model. Pick one and justify with latency (<50 ms P95 per request), memory, interpretability, and cold‑start constraints.
3) For brand‑new users and unseen hashtags, detail your cold‑start approach: e.g., trending defaults stratified by region/gender, plus exploration (epsilon‑greedy or Thompson sampling). Provide concrete parameter choices (e.g., epsilon value, priors).
4) Describe how you will learn weights (not by intuition): define the objective (e.g., cross‑entropy or pairwise NDCG), regularization, debiasing for position/propensity, and calibration of scores to follow probability.
5) Specify offline metrics (NDCG@k, MAP, calibration error) and online guardrails so that violating/sensitive hashtags are never surfaced.
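As one possible illustration of the exploration idea in part 3, the sketch below applies Beta-Bernoulli Thompson sampling to hashtags with little or no history, and applies a hard safety filter before anything is ranked. The names `HashtagArm` and `recommend`, the `Beta(1, 19)` prior (prior mean follow rate of 5%), and the `blocked` set are assumptions made for illustration and are not part of the question; a real answer would justify the prior from observed follow rates.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class HashtagArm:
    """Beta posterior over one hashtag's follow rate (hypothetical structure)."""
    alpha: float = 1.0   # prior pseudo-follows (assumed value)
    beta: float = 19.0   # prior pseudo-skips, so the prior mean is 1 / 20 = 0.05

    def sample(self, rng: random.Random) -> float:
        # One draw from the current posterior; uncertain arms produce
        # widely varying draws, which is what drives exploration.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, followed: bool) -> None:
        # Bernoulli likelihood: a follow increments alpha, a skip increments beta.
        if followed:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def recommend(arms: Dict[str, HashtagArm], blocked: Set[str], k: int,
              rng: random.Random) -> List[str]:
    """Return up to k hashtags ranked by a single Thompson draw per arm.

    Hashtags in `blocked` are dropped before sampling, so no draw,
    however lucky, can surface them.
    """
    draws = {tag: arm.sample(rng) for tag, arm in arms.items() if tag not in blocked}
    return sorted(draws, key=draws.get, reverse=True)[:k]


# Usage: a brand-new hashtag starts at the prior and is updated from feedback.
rng = random.Random(0)
arms = {"#cooking": HashtagArm(), "#travel": HashtagArm(), "#flagged": HashtagArm()}
shown = recommend(arms, blocked={"#flagged"}, k=2, rng=rng)
arms[shown[0]].update(followed=True)
```

Filtering before sampling, rather than down-weighting blocked hashtags, is a deliberate choice in this sketch: it makes the "never surfaced" requirement from part 5 independent of the model's scores.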
Quick Answer: This question evaluates expertise in recommender-system design, feature engineering, ranking and learning-to-rank models, cold-start strategies, evaluation and debiasing techniques, and operational concerns like latency, memory, interpretability, and safety guardrails.
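For the evaluation side, two of the offline metrics named in the question can be written out in a few lines. The following is a minimal sketch, assuming binary or graded relevance labels listed in the order the model ranked them, and equal-width probability bins for calibration; the function names and the choice of 10 bins are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k items (rank 1 has discount log2(2))."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """NDCG@k for relevances given in the model's ranked order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0.0:
        return 0.0  # no relevant items: define NDCG as 0 to avoid dividing by zero
    return dcg_at_k(relevances, k) / ideal


def expected_calibration_error(probs: Sequence[float], labels: Sequence[int],
                               n_bins: int = 10) -> float:
    """Weighted mean gap between predicted follow probability and observed follow rate."""
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls into the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_p = sum(p for p, _ in bucket) / len(bucket)
        avg_y = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_p - avg_y)
    return ece


# Usage: one user's ranked list with a single followed hashtag at rank 3,
# and a small batch of predicted follow probabilities with outcomes.
ndcg = ndcg_at_k([0, 0, 1], k=3)
ece = expected_calibration_error([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1])
```

In practice NDCG@k would be averaged over users, and the calibration error would be computed on a held-out set after any calibration step such as Platt scaling or isotonic regression.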