PracHub

Design a hashtag recommender for News Feed

Last updated: Mar 29, 2026

Quick Overview

This question evaluates a candidate's competency in end-to-end machine learning system design for recommender and ranking problems, covering problem framing, candidate generation, feature engineering, model training and calibration, offline and online evaluation, cold-start strategies, safety/policy considerations, and interpretability.

Company: Meta

Role: Data Scientist

Category: Machine Learning

Difficulty: hard

Interview Round: Technical Screen


Oct 13, 2025, 9:49 PM

Design: Hashtag Recommendations in the News Feed

Context

You are adding hashtag recommendations alongside posts in a large social app’s News Feed. The goal is to increase useful engagement (e.g., hashtag taps and downstream value) without harming core feed health. Design the system end‑to‑end, be precise, and justify trade‑offs.

Tasks

  1. Problem framing
    • Define the exact prediction target Y and the unit of observation (e.g., user–post–hashtag impression within a time window).
    • Specify how positives/negatives are labeled from logs.
    • Explain how you will construct additional negatives (e.g., downsampled unclicked exposures or unexposed candidates) while avoiding selection bias.
  2. Candidate generation
    • Propose at least three complementary sources (e.g., personalized affinity, content‑based from post text/media, trending/recency).
    • Explain how you will cap or diversify candidates to avoid popularity bias and ensure coverage.
  3. Features for a logistic‑regression ranker
    • List at least 10 concrete features spanning: user–hashtag affinity, post–hashtag semantic relevance, temporal/popularity signals, and platform/locale.
    • For each feature, describe expected sign/shape and how you will bucket/normalize it.
  4. Model and training
    • Justify starting with a calibrated logistic regression versus deeper models.
    • Detail regularization (L1/L2), handling class imbalance, negative sampling ratio/weights, time‑based splits, and leakage prevention (e.g., pre‑impression feature snapshots, exclude post‑publication features).
  5. Calibration and thresholds
    • Describe how you will check and correct calibration (e.g., Platt or isotonic) and set display thresholds by user cohort/surface.
    • Propose an exploration strategy/rate for new hashtags.
  6. Offline evaluation
    • Define primary metrics (e.g., log loss, AUC‑PR) and calibration plots.
    • Describe counterfactual estimation for top‑k ranking (e.g., IPS/propensity weighting) to mitigate position bias from historical data.
  7. Online experimentation
    • Specify randomization unit, guardrails (e.g., feed time, session exits, complaint rate), primary outcomes (hashtag CTR, downstream dwell, creator engagement), novelty‑effect detection, ramp plan, and stopping criteria.
  8. Cold start and freshness
    • Propose strategies for unseen hashtags/users and for detecting concept drift; include decay factors and automated retirement of stale tags.
  9. Safety and policy
    • Identify risks (e.g., sensitive or crisis‑related tags, misinformation) and propose real‑time blocks/filters and fairness checks across languages/regions.
  10. Interpretability
    • Explain how to translate key logistic‑regression coefficients into actionable product insights (e.g., diminishing returns of showing >2 tags, language mismatch penalties).
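
Tasks 1, 4, and 5 interact: if unclicked exposures are downsampled for training, the model's raw scores overstate the click probability and need correcting before any display threshold is applied. The following is a minimal sketch of the standard odds correction, assuming negatives were kept uniformly at random with a known rate. The function name `correct_for_negative_sampling` and the parameter `neg_keep_rate` are illustrative and not part of the question.

```python
def correct_for_negative_sampling(p_sampled: float, neg_keep_rate: float) -> float:
    """Map a probability learned on negative-downsampled data to the full-traffic scale.

    Assumption: every positive was kept and each negative was kept independently
    with probability `neg_keep_rate`. Downsampling multiplies the observed odds by
    1 / neg_keep_rate, so the correction multiplies the odds by neg_keep_rate.
    """
    if not 0.0 < neg_keep_rate <= 1.0:
        raise ValueError("neg_keep_rate must be in (0, 1]")
    if not 0.0 <= p_sampled <= 1.0:
        raise ValueError("p_sampled must be a probability")
    return p_sampled / (p_sampled + (1.0 - p_sampled) / neg_keep_rate)


# A score of 0.5 from a model trained with 10% of negatives kept.
corrected = correct_for_negative_sampling(0.5, neg_keep_rate=0.1)
print(round(corrected, 4))
```

An equivalent option is to add log(neg_keep_rate) to the logistic-regression intercept, or to train with each kept negative weighted by 1 / neg_keep_rate; either way, calibration should still be checked on an unsampled holdout.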

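For the counterfactual estimation in task 6, one possible starting point is a clipped inverse-propensity-scoring (IPS) estimate alongside its self-normalized variant (SNIPS). The sketch below assumes the logging policy recorded the probability with which it showed each hashtag; the `LoggedImpression` fields, the `target_prob` callable, and the toy log are hypothetical and only illustrate the arithmetic.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class LoggedImpression:
    context_id: str    # hypothetical key for the user-post pair
    hashtag: str       # hashtag the logging policy actually showed
    propensity: float  # logged probability that the logging policy showed it
    clicked: int       # 1 if the hashtag was tapped, else 0


def ips_and_snips(
    logs: Sequence[LoggedImpression],
    target_prob: Callable[[str, str], float],
    clip: float = 10.0,
) -> Tuple[float, float]:
    """Return (clipped IPS, self-normalized IPS) estimates of the target policy's CTR."""
    if not logs:
        raise ValueError("logs must be non-empty")
    weighted_clicks = 0.0
    weight_sum = 0.0
    for row in logs:
        if row.propensity <= 0.0:
            raise ValueError("propensities must be positive")
        # Importance weight: how much more (or less) often the target policy
        # would show this hashtag than the logging policy did, capped at `clip`
        # to limit variance at the cost of some bias.
        weight = min(target_prob(row.context_id, row.hashtag) / row.propensity, clip)
        weighted_clicks += weight * row.clicked
        weight_sum += weight
    ips = weighted_clicks / len(logs)
    snips = weighted_clicks / weight_sum if weight_sum > 0 else 0.0
    return ips, snips


# Toy log: the logging policy showed "#a" or "#b" with probability 0.5 each.
toy_logs = [
    LoggedImpression("u1-p1", "#a", 0.5, 1),
    LoggedImpression("u2-p2", "#a", 0.5, 0),
    LoggedImpression("u3-p3", "#b", 0.5, 0),
    LoggedImpression("u4-p4", "#b", 0.5, 0),
]


def always_a(context_id: str, hashtag: str) -> float:
    """Hypothetical target policy that always shows "#a"."""
    return 1.0 if hashtag == "#a" else 0.0


ips, snips = ips_and_snips(toy_logs, always_a)
print(ips, snips)
```

To address position bias specifically, the same weighting can be applied with position-dependent propensities (for example, examination probabilities estimated from randomized slot swaps), which this sketch does not attempt to estimate.
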