Hashtag Recommendation Design (Short-Video App)
Task
Design a system to recommend hashtags a user is likely to follow. Answer all parts precisely.
- Features/Signals
  - Enumerate at least 12 concrete features used to predict hashtag follow. For each feature, specify:
    - Data type: binary, categorical, or continuous
    - Time horizon: short-term vs. long-term
    - Normalization/bucketing strategy
  - Include the following classes of signals: recency-weighted views/likes/comments/saves, negative feedback, creator/user similarity, follow-graph features, session context (time of day, device), geographic and demographic signals, hashtag global/trending velocity, and safety indicators.
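As an illustration of what a "recency-weighted" signal with a bucketing strategy might look like, here is a minimal sketch. The function names, the 7-day half-life, and the 8 log-scale buckets are assumptions chosen for the example, not values prescribed by the task.

```python
import math

def recency_weighted_count(event_ages_days, half_life_days=7.0):
    """Sum engagement events, each down-weighted by 0.5 ** (age / half_life).

    half_life_days=7.0 is an illustrative assumption.
    """
    return sum(0.5 ** (age / half_life_days) for age in event_ages_days)

def log_bucket(value, num_buckets=8):
    """Map a non-negative value to a bucket index floor(log2(1 + value)), capped."""
    return min(int(math.log2(1.0 + value)), num_buckets - 1)

# Three likes on a hashtag: today, 7 days ago, and 14 days ago.
ages = [0.0, 7.0, 14.0]
score = recency_weighted_count(ages)   # 1 + 0.5 + 0.25 = 1.75
bucket = log_bucket(score)             # floor(log2(2.75)) = 1
```

A short half-life yields a short-term feature and a long one a long-term feature, so the same helper can cover both time horizons.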
- System Architecture
  - Propose an end-to-end system with candidate generation and ranking.
  - For ranking, compare: linear model, gradient-boosted trees, and wide-and-deep. Pick one and justify with latency (<50 ms P95 per request), memory, interpretability, and cold-start constraints.
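The latency constraint can be made checkable. The sketch below computes a nearest-rank P95 over measured per-request latencies and compares it with the 50 ms budget; the helper names and the choice of the nearest-rank percentile definition are assumptions made for illustration.

```python
import math

def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples <= it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def meets_latency_budget(latencies_ms, budget_ms=50.0):
    """True if the P95 latency is strictly under the budget (<50 ms in the task)."""
    return percentile_nearest_rank(latencies_ms, 95) < budget_ms

# 95 fast requests and 5 slow ones: P95 is still the fast value.
ok = meets_latency_budget([10.0] * 95 + [200.0] * 5)
# One more slow request pushes P95 over the budget.
too_slow = meets_latency_budget([10.0] * 94 + [200.0] * 6)
```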
- Cold Start
  - For brand-new users and unseen hashtags, detail your approach (e.g., trending defaults stratified by region/gender, exploration via epsilon-greedy or Thompson sampling). Provide concrete parameter choices (e.g., epsilon values, priors).
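For reference, Thompson sampling over hashtags can be sketched with a Beta-Bernoulli model, as below. The `thompson_pick` name, the `(follows, impressions)` input shape, and the Beta(1, 20) prior are hypothetical choices for this example; an actual answer should justify its own prior.

```python
import random

def thompson_pick(stats, prior_alpha=1.0, prior_beta=20.0, rng=random):
    """Pick the hashtag with the highest sampled follow rate.

    stats maps hashtag -> (follows, impressions). Each hashtag's follow rate
    gets a Beta(prior_alpha + follows, prior_beta + non_follows) posterior.
    The Beta(1, 20) prior (mean of about 0.048) is an illustrative assumption.
    """
    best_tag, best_draw = None, -1.0
    for tag, (follows, impressions) in stats.items():
        draw = rng.betavariate(prior_alpha + follows,
                               prior_beta + impressions - follows)
        if draw > best_draw:
            best_tag, best_draw = tag, draw
    return best_tag

# An unseen hashtag (0 impressions) is scored from the prior alone,
# so it still wins some of the draws and gets explored.
rng = random.Random(0)
stats = {"#new": (0, 0), "#known": (5, 100)}
picks = [thompson_pick(stats, rng=rng) for _ in range(200)]
```

Because the unseen hashtag's posterior is wider, it is shown often at first and less often as impressions accumulate without follows.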
- Learning the Weights
  - Define the objective (e.g., cross-entropy or pairwise NDCG), regularization, debiasing for position/propensity, and calibration of scores to follow probability.
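One way to combine these pieces is a cross-entropy objective with inverse-propensity weights and an L2 penalty. The sketch below is an assumption-laden illustration: the function name, the weight clip of 10, and the plain (not self-normalized) mean are choices made for the example.

```python
import math

def ips_log_loss(labels, scores, propensities, weights=(), l2=0.0, clip=10.0):
    """Inverse-propensity-weighted cross-entropy plus an L2 penalty.

    labels: 1 if the hashtag was followed, else 0.
    scores: predicted follow probabilities in (0, 1).
    propensities: probability the impression was shown/examined at its position.
    clip caps 1/propensity to limit variance (10.0 is an illustrative value).
    """
    total = 0.0
    for y, p, q in zip(labels, scores, propensities):
        w = min(1.0 / q, clip)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels) + l2 * sum(v * v for v in weights)

# The second impression sat in a position seen half as often, so it counts double.
loss = ips_log_loss([1, 0], [0.5, 0.5], [1.0, 0.5])  # 1.5 * ln(2)
```

Calibration (for example Platt scaling or isotonic regression on held-out data) would be a separate step applied to the trained model's scores.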
- Metrics and Safety
  - Specify offline metrics (e.g., NDCG@k, MAP, calibration error).
  - Specify online guardrails that ensure violating/sensitive hashtags are never surfaced.
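As a concrete reference point for both items, the sketch below computes NDCG@k with linear gain and applies a hard blocklist filter as the last step before serving. The function names, the linear-gain variant, and the case-insensitive exact-match blocklist are assumptions for illustration; a production guardrail would likely add further layers such as classifier-based checks.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain with linear gain: sum of rel_i / log2(i + 1), i from 1."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def filter_blocked(ranked_hashtags, blocklist):
    """Drop blocked hashtags after ranking, preserving the order of the rest."""
    blocked = {tag.lower() for tag in blocklist}
    return [tag for tag in ranked_hashtags if tag.lower() not in blocked]

# Relevance 1 = the user followed the hashtag at that rank, 0 = did not.
ndcg = ndcg_at_k([1, 0, 1], k=3)
safe = filter_blocked(["#Cats", "#banned", "#dogs"], {"#BANNED"})
```

Running the filter after ranking rather than only at candidate generation means a hashtag added to the blocklist is suppressed even if it is already in a cached candidate set.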