Design a Warehouse for Key Metrics
Company: TikTok
Role: Data Engineer
Category: System Design
Difficulty: hard
Interview Round: Technical Screen
Given an e-commerce marketplace with buyers, sellers, orders, order_items, payments, and page_view events, design a data warehouse model to compute:
1) daily session-to-purchase conversion rate,
2) average order value,
3) 7-day buyer retention by signup cohort,
4) cancellation rate by seller, and
5) daily GMV by category.
Specify the tables you would create (fact and dimension), the grain of each table, key columns (primary/foreign keys), important attributes and data types, partitioning and clustering strategy, and how you would handle late-arriving events, deduplication, null/anonymous users, and slowly changing seller attributes. Justify star vs. snowflake choices and any surrogate keys. No SQL is required.
Quick Answer: This question evaluates a data engineer's skills in dimensional modeling, data warehousing, and analytics infrastructure, covering facts and dimensions, grain definition, surrogate keys, partitioning and clustering strategies, and operational concerns like deduplication, late-arriving events, null/anonymous users, and slowly changing dimensions.
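As a rough illustration of two of the concerns named above (deduplication and anonymous users), the following Python sketch computes the daily session-to-purchase conversion rate over a handful of made-up rows. The tuple layouts, the `dedupe` and `daily_conversion` helpers, and the choice to dedupe on `event_id` are all assumptions for the example, not part of the question. It also assumes each order carries the `session_id` it was placed in, and it attributes a session to the day of its page views, which is a simplification.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw page_view events: (event_id, session_id, buyer_id, event_date).
# buyer_id is None for anonymous visitors; event "e1" was delivered twice.
page_views = [
    ("e1", "s1", "b1", date(2024, 1, 1)),
    ("e1", "s1", "b1", date(2024, 1, 1)),  # duplicate delivery
    ("e2", "s2", None, date(2024, 1, 1)),  # anonymous session, still counted
    ("e3", "s3", "b2", date(2024, 1, 2)),
]

# Hypothetical orders: (order_id, session_id, order_date).
orders = [("o1", "s1", date(2024, 1, 1))]


def dedupe(events):
    """Keep only the first occurrence of each event_id."""
    seen, unique = set(), []
    for ev in events:
        if ev[0] not in seen:
            seen.add(ev[0])
            unique.append(ev)
    return unique


def daily_conversion(events, orders):
    """Return {date: purchasing sessions / all sessions} per event date."""
    sessions_by_day = defaultdict(set)
    for _, session_id, _, day in dedupe(events):
        # Sessions are keyed by session_id, so a null buyer_id does not drop them.
        sessions_by_day[day].add(session_id)
    purchasing = {session_id for _, session_id, _ in orders}
    return {
        day: len(sessions & purchasing) / len(sessions)
        for day, sessions in sessions_by_day.items()
    }


rates = daily_conversion(page_views, orders)
```

Keying the denominator on `session_id` rather than `buyer_id` is what lets anonymous traffic count toward conversion; in a warehouse the same effect is often achieved by pointing null buyers at a dedicated "unknown" dimension row.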