How do I approach ML System Design interview questions?

Answering ML System Design questions well takes a solid grasp of core concepts plus practice. PracHub provides solutions with explanations to help you master ML System Design interviews.

What difficulty level is this interview question?

This is a medium difficulty ML System Design question, commonly asked during Technical Screen rounds at Bytedance.

What role is this question designed for?

This question is commonly asked for Software Engineer candidates at Bytedance during technical interviews.

Design a Personalized Content Recommendation Engine

This question evaluates a candidate's ability to design a large-scale personalized recommendation system, covering problem framing, candidate retrieval, ranking, and serving under tight latency constraints. It tests knowledge of machine learning system design, including multi-objective modeling, cold-start handling, and production evaluation, at a practical, applied level typical of ML system design interviews.

You are asked to design the recommendation engine that powers the personalized home feed of a large content-sharing platform (think a short-video or article feed). When a user opens the app, the system must return an ordered list of items the user is most likely to engage with, and continuously refresh that list as the user scrolls.

Design this recommendation engine end to end: how you frame it as a machine-learning problem, how you generate and rank candidates from a huge catalog, how you serve recommendations at low latency and high scale, and how you evaluate and monitor the system in production.

Constraints & Assumptions

Catalog has 100M+ items and grows continuously — new items are uploaded every second.
~100M daily active users; each feed request loads ~10-20 items, and a user may scroll through hundreds of items in a session.
A feed request must return within roughly 100-200 ms at p99.
Strong long-tail and cold-start pressure: brand-new items and brand-new users appear constantly.
Available engagement signals: impressions, plays/clicks, watch time, likes, shares, follows, and skips.
The business goal is long-term engagement / retention, not just maximizing the next click.
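
A quick back-of-envelope estimate helps turn these constraints into a serving target. The sketch below is only illustrative: the 20 feed requests per user per day and the 3x peak-to-average factor are assumptions, not numbers given in the question, and `estimate_feed_qps` is a hypothetical helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_feed_qps(dau, requests_per_user_per_day, peak_to_average=3.0):
    """Return (average_qps, peak_qps) for feed requests.

    requests_per_user_per_day and peak_to_average are assumed inputs;
    confirm them with the interviewer before relying on the result.
    """
    average_qps = dau * requests_per_user_per_day / SECONDS_PER_DAY
    return average_qps, average_qps * peak_to_average


# 100M DAU comes from the constraints; 20 requests/user/day is an assumption
# (for example, about 200 items viewed at about 10 items per request).
average_qps, peak_qps = estimate_feed_qps(100_000_000, 20)
print(f"average ~{average_qps:,.0f} QPS, peak ~{peak_qps:,.0f} QPS")
```

Under these assumptions the system needs to sustain tens of thousands of feed requests per second, which is why the later parts separate cheap retrieval from expensive ranking.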

Clarifying Questions to Ask

What is the primary business objective — day-N retention, total watch time, ad revenue, creator growth — and how do we trade these off against each other?
What item types are in scope (videos, posts, ads, who-to-follow), and is this a single blended feed or several separate rails?
What is the operating scale (catalog size, DAU, peak QPS) and the hard latency budget per feed request?
Which signals and logs are available, and how fresh are they (real-time watch-time events vs daily batch)?
Are there hard constraints to honor — content-safety / policy filtering, diversity, freshness, or fairness to creators?
How is the feed consumed (infinite scroll vs paginated) and how often must it be recomputed within a session?

Part 1 — Problem framing and objective

Frame the recommendation task as a machine-learning problem. What exactly are you predicting, what is the label, and how do you translate the business goal into a trainable objective?
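
One possible framing is to treat each logged impression as a training example with several labels, one per engagement signal. The sketch below assumes a hypothetical `Impression` record and an arbitrary 80% completion threshold; neither is specified in the question.

```python
from dataclasses import dataclass


@dataclass
class Impression:
    """One logged (user, item) impression; the field names are hypothetical."""
    played: bool
    watch_seconds: float
    item_seconds: float
    liked: bool
    shared: bool
    skipped: bool


def build_labels(imp, finish_threshold=0.8):
    """Turn one impression into multi-task training labels.

    finish_threshold is an assumed cutoff for counting a view as finished.
    """
    if imp.item_seconds > 0:
        # Cap at 1.0 so replays do not produce a completion above 100%.
        completion = min(imp.watch_seconds / imp.item_seconds, 1.0)
    else:
        completion = 0.0
    return {
        "play": float(imp.played),
        "completion": completion,
        "finished": float(completion >= finish_threshold),
        "like": float(imp.liked),
        "share": float(imp.shared),
        "skip": float(imp.skipped),
    }


example = Impression(played=True, watch_seconds=5.0, item_seconds=20.0,
                     liked=False, shared=False, skipped=True)
print(build_labels(example))
```

Keeping the labels separate leaves room to weight them against the long-term retention goal later, rather than baking one proxy into the training target.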

Part 2 — Candidate generation (retrieval)

With 100M+ items you cannot score the whole catalog on every request. Design the candidate-generation stage that narrows the catalog down to a few hundred candidates per request.
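
A common way to narrow the catalog is embedding-based retrieval: score items by the dot product between a user vector and item vectors and keep the top k. The sketch below uses an exact scan over a toy catalog purely for illustration; at 100M+ items an approximate nearest-neighbor index would replace the scan. The embeddings and the `retrieve_top_k` helper are assumptions.

```python
import heapq


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve_top_k(user_vec, item_vecs, k, exclude=frozenset()):
    """Return the ids of the k items with the highest dot-product score.

    item_vecs maps item_id -> embedding. exclude holds ids to drop,
    for example items the user has already seen.
    """
    scored = (
        (dot(user_vec, vec), item_id)
        for item_id, vec in item_vecs.items()
        if item_id not in exclude
    )
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


# Toy two-dimensional embeddings, made up for the example.
catalog = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
print(retrieve_top_k([1.0, 0.0], catalog, k=2))
```

In practice several such sources (embedding similarity, followed creators, trending items) are usually merged and de-duplicated before ranking.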

Part 3 — Ranking

Given the few hundred retrieved candidates, design the ranking model that orders them for this specific user in this context.
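
If the ranker predicts several engagement probabilities per candidate, one simple way to order candidates is a weighted sum of those predictions. This is a sketch under assumptions: the head names, the weights, and the predicted values below are invented for illustration and would normally be tuned through online experiments.

```python
def value_score(preds, weights):
    """Weighted sum of predicted engagement probabilities."""
    return sum(w * preds.get(name, 0.0) for name, w in weights.items())


def rank_candidates(candidates, weights):
    """Sort candidates by value score, highest first."""
    return sorted(
        candidates,
        key=lambda c: value_score(c["preds"], weights),
        reverse=True,
    )


# Assumed weights: shares count most, and a predicted skip is penalized.
WEIGHTS = {"finish": 1.0, "like": 2.0, "share": 3.0, "skip": -1.0}

candidates = [
    {"id": "A", "preds": {"finish": 0.6, "like": 0.05, "share": 0.01, "skip": 0.2}},
    {"id": "B", "preds": {"finish": 0.4, "like": 0.2, "share": 0.05, "skip": 0.1}},
    {"id": "C", "preds": {"finish": 0.7, "like": 0.01, "share": 0.0, "skip": 0.5}},
]
print([c["id"] for c in rank_candidates(candidates, WEIGHTS)])
```

Item C has the highest predicted completion but also the highest predicted skip rate, so it falls behind the others; this is the kind of trade-off the weights make explicit.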

Part 4 — Serving, scale, and freshness

Describe the online serving architecture that returns a ranked feed within the latency budget at this scale, and explain how features and models stay fresh.
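
One way to respect the latency budget is to let the request degrade gracefully: if retrieval uses up most of the budget, skip the heavy ranker and return candidates in retrieval order. The sketch below assumes a 150 ms budget and a 50 ms reserve for ranking; `serve_feed`, its parameters, and the mode names are hypothetical, and a real service would enforce timeouts through RPC deadlines rather than checking the clock between calls.

```python
import time


def serve_feed(user_id, retrieve, rank, fallback_items, page_size=10,
               budget_ms=150.0, rank_reserve_ms=50.0, clock=time.monotonic):
    """Return (items, mode), where mode is 'ranked', 'degraded', or 'fallback'."""
    deadline = clock() + budget_ms / 1000.0
    candidates = retrieve(user_id)
    if not candidates:
        # Nothing retrieved: serve a precomputed list such as popular items.
        return fallback_items[:page_size], "fallback"
    remaining_ms = (deadline - clock()) * 1000.0
    if remaining_ms < rank_reserve_ms:
        # Too little time left for the ranker: keep retrieval order.
        return candidates[:page_size], "degraded"
    return rank(user_id, candidates)[:page_size], "ranked"


# Stand-in retrieval and ranking functions for the example.
items, mode = serve_feed(
    "user-1",
    retrieve=lambda user: ["a", "b", "c"],
    rank=lambda user, cands: sorted(cands, reverse=True),
    fallback_items=["popular-1", "popular-2"],
)
print(mode, items)
```

The `clock` parameter is injectable so the degradation paths can be tested without real delays.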

Part 5 — Evaluation, experimentation, and monitoring

How do you know the system is good, and how do you safely ship changes to it?
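
Offline, two widely used checks are recall@k for the retrieval stage and NDCG@k for the ranking stage. The sketch below uses linear gains for NDCG, which is one common variant, and toy inputs invented for the example. Offline metrics of this kind are usually a gate before an online A/B test, not a replacement for one.

```python
import math


def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant items that appear in the top k retrieved ids."""
    if not relevant:
        return 0.0
    hits = sum(1 for item_id in retrieved[:k] if item_id in relevant)
    return hits / len(relevant)


def dcg_at_k(relevances, k):
    """Discounted cumulative gain with linear gains."""
    return sum(rel / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Toy data: the user engaged with items "a" and "d".
print(recall_at_k(["a", "b", "c"], {"a", "d"}, k=3))
# Relevance of the ranked list in served order.
print(round(ndcg_at_k([1, 0, 1], k=3), 4))
```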

Follow-up Questions

How would you serve a brand-new user with zero history on their very first session, and how does the experience evolve over their first few interactions?
Retrieval and ranking disagree often (good candidates rank low, or weak candidates dominate). How do you debug whether the bottleneck is retrieval or ranking?
Engagement is up but long-term retention is flat or declining. How do you detect and fix a feedback loop that is over-promoting clickbait?
How would you introduce a new objective — say creator-side fairness or content diversity — without retraining the entire stack from scratch?
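
For the last follow-up, one option that avoids retraining is a lightweight re-ranking pass on top of the existing ranker output. The sketch below greedily penalizes items from creators that already appear in the feed; the penalty value and the `(item_id, creator_id, score)` tuple layout are assumptions made for illustration.

```python
def rerank_with_creator_penalty(ranked, penalty=0.1, top_n=10):
    """Greedy re-rank that discounts creators already placed in the feed.

    ranked is a list of (item_id, creator_id, score) tuples from the ranker.
    Each earlier item by the same creator subtracts `penalty` from the score.
    """
    remaining = list(ranked)
    placed_per_creator = {}
    feed = []
    while remaining and len(feed) < top_n:
        best = max(
            remaining,
            key=lambda c: c[2] - penalty * placed_per_creator.get(c[1], 0),
        )
        remaining.remove(best)
        feed.append(best[0])
        placed_per_creator[best[1]] = placed_per_creator.get(best[1], 0) + 1
    return feed


ranker_output = [
    ("v1", "creator-1", 0.90),
    ("v2", "creator-1", 0.85),
    ("v3", "creator-2", 0.80),
    ("v4", "creator-1", 0.70),
]
print(rerank_with_creator_penalty(ranker_output))
```

With a penalty of 0.1, the second item from creator-1 drops below the first item from creator-2, so the feed interleaves creators while leaving the upstream models untouched.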
