PracHub

ML System Design: Thumbnail Selection for a Streaming Catalog

Last updated: Jun 24, 2026

Quick Overview

This ML system design question tests a candidate's ability to architect an end-to-end personalization and ranking system, covering problem framing, reward modeling, and low-latency serving at scale. It evaluates practical knowledge of contextual bandits, engagement-signal design, cold-start handling, and off-policy evaluation — core competencies for machine learning engineering roles.

Company: Tubitv

Role: Machine Learning Engineer

Category: ML System Design

Difficulty: medium

Interview Round: Technical Screen

You work at a video streaming service. For each title in the catalog (movie or show), there are multiple candidate thumbnail images — for example, frames automatically sampled from the video, plus a few editorially produced artwork variants. Given a title and its set of candidate thumbnails, design a machine learning system that selects which thumbnail to show each user so as to maximize engagement (the user clicking the title and starting to watch).

Design the end-to-end system: how you frame the problem, what data and labels you use, the model, how you evaluate it offline and online, how you serve it at scale, and how you monitor it in production. Discuss whether and how you would personalize the choice per user versus picking one globally best thumbnail per title.

Constraints & Assumptions

  • Catalog on the order of $10^5$ titles; each title has roughly 5-30 candidate thumbnails.
  • Tens of millions of users; the home/browse screen renders many titles per impression.
  • The thumbnail must be chosen at serve time within the page-render latency budget (single-digit to low tens of milliseconds for the ranking/selection step).
  • New titles and freshly generated candidate thumbnails appear continuously (cold start).
  • The primary engagement signal combines the click with subsequent viewing behavior; raw clicks alone may not fully capture genuine interest.
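
One way to reason about the cold-start and engagement-signal constraints together is a non-personalized Beta-Bernoulli Thompson sampler kept per title. The sketch below is illustrative only and is not part of the question: the `qualified_play` reward (a click followed by at least `min_watch_seconds` of viewing), the 120-second default, and the `TitleThumbnailBandit` class are all hypothetical names and values chosen for the example.

```python
import random
from dataclasses import dataclass, field


def qualified_play(clicked: bool, watch_seconds: float,
                   min_watch_seconds: float = 120.0) -> int:
    """Hypothetical reward: 1 only if the click led to meaningful viewing."""
    return int(clicked and watch_seconds >= min_watch_seconds)


@dataclass
class TitleThumbnailBandit:
    """Beta-Bernoulli Thompson sampling over one title's candidate thumbnails."""
    thumbnail_ids: list
    successes: dict = field(default_factory=dict)
    failures: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        for thumb in list(self.thumbnail_ids):
            self._init_prior(thumb)

    def _init_prior(self, thumb: str) -> None:
        # Beta(1, 1) prior: a thumbnail with no data is maximally uncertain,
        # so it keeps getting explored until evidence accumulates.
        self.successes.setdefault(thumb, 1.0)
        self.failures.setdefault(thumb, 1.0)

    def add_thumbnail(self, thumb: str) -> None:
        """Register a freshly generated candidate (cold start)."""
        if thumb not in self.thumbnail_ids:
            self.thumbnail_ids.append(thumb)
        self._init_prior(thumb)

    def select(self, rng: random.Random) -> str:
        """Sample a plausible reward rate per thumbnail and show the best draw."""
        draws = {
            thumb: rng.betavariate(self.successes[thumb], self.failures[thumb])
            for thumb in self.thumbnail_ids
        }
        return max(draws, key=draws.get)

    def update(self, thumb: str, reward: int) -> None:
        """Fold one observed impression outcome into the posterior."""
        if reward:
            self.successes[thumb] += 1.0
        else:
            self.failures[thumb] += 1.0
```

In this sketch the `select` step is a handful of random draws per title, which is the kind of cost that could plausibly fit a tight serving budget, although that would need to be measured. A personalized variant would replace the per-thumbnail counts with a contextual model.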

Clarifying Questions to Ask

  • What engagement signal should the system optimize — and what are the risks of optimizing a coarser vs. a more nuanced signal?
  • Is the thumbnail choice per-user personalized, or should there be one globally winning thumbnail per title? What is the appetite for personalization complexity?
  • How are candidate thumbnails generated, and how many per title? Is there an editorial or brand constraint on which images are eligible?
  • What is the serving latency budget for the thumbnail decision within the page render?
  • How quickly must a brand-new title or a new candidate image start being shown well (cold-start expectations)?
  • Are there fairness or quality guardrails (e.g., no misleading frames, content-appropriateness)?

Follow-up Questions

  • Your logged data only contains feedback for the thumbnail that was actually shown. How do you train or evaluate a model that wants to reason about thumbnails that were never shown for a given user? (Discuss exploration and off-policy/counterfactual evaluation.)
  • A naive click-maximizing model starts surfacing sensational, slightly misleading frames that get clicks but low completion. How do you detect this and change the objective to prevent it?
  • A brand-new title enters the catalog with five never-seen thumbnails and zero engagement data. Walk through exactly how the system behaves for the first hours and days.
  • How would you decide whether per-user personalization is actually worth the added complexity over a single global best thumbnail per title? What experiment would settle it?
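
For the first follow-up, a common starting point is inverse propensity scoring (IPS), which reweights logged impressions by how much more (or less) likely the candidate policy is to show the logged thumbnail than the logging policy was. The sketch below is a minimal illustration under stated assumptions: it assumes the logging policy recorded a nonzero propensity for every shown thumbnail, and the `LoggedImpression` record, the `target_prob` callback, and the `max_weight` clipping value are hypothetical names introduced for the example.

```python
from typing import Callable, NamedTuple, Sequence, Tuple


class LoggedImpression(NamedTuple):
    context: dict        # user/title features at serve time
    shown: str           # thumbnail id that was actually displayed
    propensity: float    # probability the logging policy displayed `shown`
    reward: float        # observed outcome, e.g. 1.0 for a qualified play


def ips_estimates(
    logs: Sequence[LoggedImpression],
    target_prob: Callable[[dict, str], float],
    max_weight: float = 50.0,
) -> Tuple[float, float]:
    """Return (IPS, self-normalized IPS) value estimates for a target policy.

    `target_prob(context, thumbnail_id)` is the probability that the policy
    being evaluated would show that thumbnail in that context.
    """
    if not logs:
        raise ValueError("need at least one logged impression")

    weights = []
    weighted_rewards = []
    for row in logs:
        if row.propensity <= 0.0:
            raise ValueError("logged propensity must be positive")
        # Clip the importance weight to trade a little bias for lower variance.
        weight = min(target_prob(row.context, row.shown) / row.propensity,
                     max_weight)
        weights.append(weight)
        weighted_rewards.append(weight * row.reward)

    ips = sum(weighted_rewards) / len(logs)
    total_weight = sum(weights)
    snips = sum(weighted_rewards) / total_weight if total_weight > 0 else 0.0
    return ips, snips
```

The estimate is only defined for thumbnails the logging policy had some chance of showing, which is why the question ties off-policy evaluation to exploration: a thumbnail with zero logging probability contributes nothing to the logs, no matter how the target policy scores it.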
