PracHub

Design a scalable video platform database

Last updated: Mar 29, 2026

Quick Overview

This question evaluates relational database design and data engineering skills for a Data Scientist role in the Data Manipulation (SQL/Python) domain: schema modeling, many-to-many relationships, idempotent ingest, indexing and partitioning, OLTP versus analytics integration, GDPR-compliant deletion strategies, and query formulation. It is commonly asked to assess both conceptual understanding and practical application of scalable data architectures, performance tuning, and compliance trade-offs. The focus is on reasoning about schema choices, read/write optimization, and analytics integration rather than on low-level implementation details.

Company: Google

Role: Data Scientist

Category: Data Manipulation (SQL/Python)

Difficulty: Medium

Interview Round: Technical Screen

Design the relational database for a YouTube-like video company. Deliverables:

  1) List the core tables with key columns, types, and constraints (users, channels, videos, video_transcodes/qualities, captions, tags, video_tags, views, likes, comments, subscriptions, playlists, playlist_videos, ad_impressions, daily_video_metrics).
  2) Define primary/foreign keys, uniqueness, and soft-delete and GDPR-compliant deletion strategies.
  3) Model many-to-many relationships (e.g., videos↔tags, playlists↔videos) and idempotent ingest (avoid duplicate views/likes).
  4) Include indexing/partitioning (e.g., views partitioned by event_date, video_id; clustered indexes for hot queries), and how you’d support both OLTP and analytics (star schema or read-optimized warehouse tables) without blocking writes.
  5) Show sample CREATE TABLE DDL for 3–4 critical tables (videos, views, comments, ad_impressions) and explain how you’d query:
       a) watch-time per video per day
       b) top N videos by unique viewers in the last 7 days
       c) comments pagination with anti-abuse flags
  6) Describe how you’d store multiple renditions (1080p, 4K, HDR) and A/B test assignments for thumbnails.
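
One possible starting point for deliverables 3 to 5 is sketched below. It is not a reference answer: it uses Python's built-in sqlite3 module as a stand-in for a production engine, and the column names, the `view_event_id` idempotency key, the `is_flagged` column, and all sample rows are assumptions made for illustration. SQLite has no table partitioning, so the index on (event_date, video_id) only approximates the partitioned layout the question mentions.

```python
import sqlite3

# In-memory SQLite stands in for the OLTP store (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE videos (
    video_id   INTEGER PRIMARY KEY,
    channel_id INTEGER NOT NULL,
    title      TEXT    NOT NULL,
    deleted_at TEXT                        -- soft delete: NULL means live
);
CREATE TABLE views (
    view_event_id TEXT    PRIMARY KEY,     -- hypothetical client-generated idempotency key
    video_id      INTEGER NOT NULL REFERENCES videos(video_id),
    viewer_id     INTEGER,                 -- NULL for signed-out viewers
    event_date    TEXT    NOT NULL,        -- 'YYYY-MM-DD'
    watch_seconds INTEGER NOT NULL CHECK (watch_seconds >= 0)
);
CREATE INDEX idx_views_date_video ON views (event_date, video_id);
CREATE TABLE comments (
    comment_id INTEGER PRIMARY KEY,
    video_id   INTEGER NOT NULL REFERENCES videos(video_id),
    author_id  INTEGER NOT NULL,
    body       TEXT    NOT NULL,
    created_at TEXT    NOT NULL,           -- ISO-8601 UTC timestamp
    is_flagged INTEGER NOT NULL DEFAULT 0  -- assumed anti-abuse flag (0 or 1)
);
CREATE INDEX idx_comments_page ON comments (video_id, created_at DESC, comment_id DESC);
""")

conn.executemany("INSERT INTO videos (video_id, channel_id, title) VALUES (?, ?, ?)",
                 [(1, 10, "Intro"), (2, 10, "Deep dive")])

# Idempotent ingest: a retried event reuses its key and is silently skipped.
events = [
    ("e1", 1, 100, "2026-03-01", 30),
    ("e2", 1, 101, "2026-03-01", 45),
    ("e2", 1, 101, "2026-03-01", 45),  # duplicate delivery of e2
    ("e3", 2, 100, "2026-03-01", 60),
    ("e4", 1, 100, "2026-03-02", 20),
]
conn.executemany("INSERT OR IGNORE INTO views VALUES (?, ?, ?, ?, ?)", events)

conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 1, 100, "first",     "2026-03-01T10:00:00", 0),
    (2, 1, 101, "spam link", "2026-03-01T11:00:00", 1),
    (3, 1, 102, "nice",      "2026-03-01T12:00:00", 0),
    (4, 1, 100, "thanks",    "2026-03-01T12:00:00", 0),
])

# a) Watch-time per video per day.
watch_time = conn.execute("""
    SELECT video_id, event_date, SUM(watch_seconds) AS total_watch_seconds
    FROM views
    GROUP BY video_id, event_date
    ORDER BY video_id, event_date
""").fetchall()

def top_videos(as_of_date, n):
    """b) Top n videos by distinct signed-in viewers in the 7 days ending as_of_date."""
    return conn.execute("""
        SELECT video_id, COUNT(DISTINCT viewer_id) AS unique_viewers
        FROM views
        WHERE event_date > date(?, '-7 days') AND event_date <= ?
        GROUP BY video_id
        ORDER BY unique_viewers DESC, video_id
        LIMIT ?
    """, (as_of_date, as_of_date, n)).fetchall()

def comments_page(video_id, limit, after=None):
    """c) Keyset pagination, newest first, hiding flagged comments.

    `after` is the (created_at, comment_id) pair of the last row already shown.
    """
    cursor_ts, cursor_id = after or ("9999-12-31T23:59:59", 0)
    return conn.execute("""
        SELECT comment_id, created_at, body
        FROM comments
        WHERE video_id = ? AND is_flagged = 0
          AND (created_at, comment_id) < (?, ?)
        ORDER BY created_at DESC, comment_id DESC
        LIMIT ?
    """, (video_id, cursor_ts, cursor_id, limit)).fetchall()

page1 = comments_page(1, 2)
page2 = comments_page(1, 2, after=(page1[-1][1], page1[-1][0]))
```

The primary key on the event identifier is what makes the ingest idempotent; the same idea applies to likes with a composite key on (user, video). Keyset pagination is used here instead of OFFSET so that page cost does not grow with depth, and the comment_id tie-breaker keeps the ordering stable when two comments share a timestamp. Note that COUNT(DISTINCT viewer_id) skips NULLs, so signed-out viewers are not counted in this sketch. In a real deployment the aggregate queries would more likely run against a warehouse or replica than against the write path.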

Oct 13, 2025, 9:49 PM

