How do I approach ML System Design interview questions?

ML System Design questions require both an understanding of core concepts and practice. PracHub provides solutions with explanations to help you master ML System Design interviews.

What difficulty level is this interview question?

This is a medium difficulty ML System Design question, commonly asked during Technical Screen rounds at Tubitv.

What role is this question designed for?

This question is commonly asked for Machine Learning Engineer candidates at Tubitv during technical interviews.

ML System Design: Thumbnail Selection for a Streaming Catalog

This ML system design question tests a candidate's ability to architect an end-to-end personalization and ranking system, covering problem framing, reward modeling, and low-latency serving at scale. It evaluates practical knowledge of contextual bandits, engagement-signal design, cold-start handling, and off-policy evaluation — core competencies for machine learning engineering roles.

You work at a video streaming service. For each title in the catalog (movie or show), there are multiple candidate thumbnail images — for example, frames automatically sampled from the video, plus a few editorially produced artwork variants. Given a title and its set of candidate thumbnails, design a machine learning system that selects which thumbnail to show each user so as to maximize engagement (the user clicking the title and starting to watch).

Design the end-to-end system: how you frame the problem, what data and labels you use, the model, how you evaluate it offline and online, how you serve it at scale, and how you monitor it in production. Discuss whether and how you would personalize the choice per user versus picking one globally best thumbnail per title.

Constraints & Assumptions

Catalog on the order of $10^5$ titles; each title has roughly 5-30 candidate thumbnails.
Tens of millions of users; the home/browse screen renders many titles per impression.
The thumbnail must be chosen at serve time within the page-render latency budget (single-digit to low tens of milliseconds for the ranking/selection step).
New titles and freshly generated candidate thumbnails appear continuously (cold start).
Primary engagement signal involves the click and subsequent viewing behavior; raw clicks alone may not fully capture genuine interest.
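
One way to make the last constraint concrete is a reward that credits a click only when it leads to real viewing. The sketch below is illustrative, not a prescribed answer: the `Impression` fields, the 120-second qualifying-watch threshold, and the 0.2 partial-credit cap are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Impression:
    """One logged thumbnail impression (field names are hypothetical)."""
    clicked: bool
    watch_seconds: float  # seconds watched after the click; 0 if no click


def engagement_reward(imp: Impression, min_watch_seconds: float = 120.0) -> float:
    """Map an impression to a reward in [0, 1].

    Assumed scheme: no click -> 0.0; click followed by a qualifying watch
    -> 1.0; click followed by early abandonment -> small partial credit
    that grows with the fraction of the threshold watched (at most 0.2).
    """
    if not imp.clicked:
        return 0.0
    if imp.watch_seconds >= min_watch_seconds:
        return 1.0
    # Click-then-bounce: discourage clickbait by capping the credit low.
    return 0.2 * (imp.watch_seconds / min_watch_seconds)
```

Under this scheme a thumbnail that attracts clicks but rarely leads to sustained viewing earns far less than one with fewer, higher-quality clicks, which is the behavior the constraint asks for.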

Clarifying Questions to Ask

What engagement signal should the system optimize — and what are the risks of optimizing a coarser vs. a more nuanced signal?
Is the thumbnail choice per-user personalized, or should there be one globally winning thumbnail per title? What is the appetite for personalization complexity?
How are candidate thumbnails generated, and how many per title? Is there an editorial or brand constraint on which images are eligible?
What is the serving latency budget for the thumbnail decision within the page render?
How quickly must a brand-new title or a new candidate image start being shown well (cold-start expectations)?
Are there fairness or quality guardrails (e.g., no misleading frames, content-appropriateness)?
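
The cold-start and global-versus-personalized questions above are easier to discuss against a concrete baseline. A minimal sketch, assuming a non-personalized Beta-Bernoulli Thompson sampling bandit per title: the class name, method names, and the uniform Beta(1, 1) prior are hypothetical choices for illustration, and a personalized system would replace this with a contextual model.

```python
import random


class ThumbnailBandit:
    """Beta-Bernoulli Thompson sampling over one title's candidate thumbnails."""

    def __init__(self, thumbnail_ids, prior_alpha=1.0, prior_beta=1.0, seed=None):
        self._rng = random.Random(seed)
        self._prior = (prior_alpha, prior_beta)
        # Per-thumbnail [alpha, beta] posterior parameters.
        self.stats = {t: [prior_alpha, prior_beta] for t in thumbnail_ids}

    def add_thumbnail(self, thumbnail_id):
        """Register a new candidate; it starts at the prior (cold start)."""
        self.stats.setdefault(thumbnail_id, list(self._prior))

    def select(self):
        """Sample a plausible reward rate per thumbnail and show the best draw."""
        draws = {t: self._rng.betavariate(a, b) for t, (a, b) in self.stats.items()}
        return max(draws, key=draws.get)

    def update(self, thumbnail_id, reward):
        """Fold an observed reward in [0, 1] into the posterior."""
        params = self.stats[thumbnail_id]
        params[0] += reward
        params[1] += 1.0 - reward
```

A new thumbnail starts with a wide posterior, so it is sampled often at first and its traffic share shrinks automatically if its observed reward is poor. That gives exploration without a separate cold-start code path.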

Follow-up Questions

Your logged data only contains feedback for the thumbnail that was actually shown. How do you train or evaluate a model that wants to reason about thumbnails that were never shown for a given user? (Discuss exploration and off-policy/counterfactual evaluation.)
A naive click-maximizing model starts surfacing sensational, slightly misleading frames that get clicks but low completion. How do you detect this and change the objective to prevent it?
A brand-new title enters the catalog with five never-seen thumbnails and zero engagement data. Walk through exactly how the system behaves for the first hours and days.
How would you decide whether per-user personalization is actually worth the added complexity over a single global best thumbnail per title? What experiment would settle it?
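
For the first follow-up, one standard counterfactual tool is inverse propensity scoring (IPS). The sketch below assumes the logging policy recorded the probability with which it showed each thumbnail; the `LoggedEvent` fields, the `target_prob` callback, and the weight cap of 10 are hypothetical names and values for illustration. It returns both the plain IPS estimate and the self-normalized variant (SNIPS).

```python
from typing import Callable, Iterable, NamedTuple, Tuple


class LoggedEvent(NamedTuple):
    context: dict       # user/title features at serve time
    shown: str          # thumbnail id the logging policy displayed
    propensity: float   # probability the logging policy showed it (> 0)
    reward: float       # observed engagement reward


def ips_estimates(
    events: Iterable[LoggedEvent],
    target_prob: Callable[[dict, str], float],
    max_weight: float = 10.0,
) -> Tuple[float, float]:
    """Estimate the target policy's mean reward from logged data.

    target_prob(context, thumbnail) is the probability the candidate
    policy would show that thumbnail. Weights are clipped at max_weight
    to limit variance, at the cost of some bias.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    n = 0
    for e in events:
        w = min(target_prob(e.context, e.shown) / e.propensity, max_weight)
        weighted_sum += w * e.reward
        weight_total += w
        n += 1
    if n == 0:
        raise ValueError("no logged events")
    ips = weighted_sum / n
    snips = weighted_sum / weight_total if weight_total > 0 else 0.0
    return ips, snips
```

The estimate is only trustworthy if the logging policy gave every thumbnail a nonzero chance of being shown, which is why exploration in production and off-policy evaluation go together.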
