Write SQL for TikTok Live creator metrics
Company: TikTok
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: easy
Interview Round: Technical Screen
You are analyzing TikTok Live sessions and their engagement.
## Tables
### `live_room`
Each row is a Live session (“room”) launched by a creator.
| column | type | description |
|---|---|---|
| `date` | DATE | Calendar date of the Live session (assume UTC). |
| `creator_id` | BIGINT | Creator who launched the Live. |
| `room_id` | BIGINT | Unique id of the Live room. |
| `country` | STRING | Country where the Live room is attributed (e.g., `US`, `ID`, `UK`). |
| `live_type` | STRING | Device type used to go live (e.g., `mobile`, `pc`). |
**Assume PK:** (`date`, `room_id`).
### `live_room_activity`
Each row contains aggregated engagement for a room on that date.
| column | type | description |
|---|---|---|
| `date` | DATE | Same date as the Live session. |
| `room_id` | BIGINT | Live room id. |
| `watch_cnt` | INT | Number of viewers who watched that room. |
| `gift_recieved_cnt` | INT | Number of gifts received in that room. |
**Assume PK:** (`date`, `room_id`).
### Join relationship
Join the tables on **(`date`, `room_id`)**. If a room has no activity row, assume `watch_cnt = 0` and `gift_recieved_cnt = 0`.
---
## Questions (write SQL only)
### Q1
For the period **2022-10-01 to 2022-10-07** (inclusive), in the **US**, find **how many creators launched more than 3 `mobile` Live sessions**.
**Output:**
- `creator_count` (number of distinct `creator_id` meeting the condition)
### Q2 (Medium)
In **October 2022**, find the **top 5 creators by total gifts received** for **each country**.
Notes:
- Total gifts received for a creator in a country = sum of `gift_recieved_cnt` across all that creator’s rooms in that country during October 2022.
- If ties occur at rank 5, break ties by `creator_id` ascending (or specify a deterministic tie-break).
**Output (recommended):**
- `country`
- `creator_id`
- `total_gifts_received`
- `rank_in_country` (1–5)
### Q3 (Hard)
For **yesterday’s date** (relative to query run date), calculate the **count of creators** who satisfy **both**:
1) The creator went live **more than once** that day (i.e., launched **at least 2 rooms**).
2) **Every** Live room they launched that day had **`watch_cnt >= 500`**.
**Output:**
- `qualified_creator_count`
Quick Answer: This question evaluates SQL data-manipulation competencies such as joins, aggregations, group-by and window functions, handling missing activity rows, date-range filtering, country partitioning, and deterministic tie-breaking for top-k metrics.