Write SQL for top categories and highly active users
Company: Pinterest
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: easy
Interview Round: Technical Screen
You are given three tables:
### 1) `impression`
Event-level table of user impressions.
- `impression_id` BIGINT (PK)
- `user_id` BIGINT (FK → `user.user_id`)
- `pin_id` BIGINT (FK → `pin_info.pin_id`)
- `surface` VARCHAR (e.g., 'home', 'search', 'profile', etc.)
- `impression_ts` TIMESTAMP (when the impression happened)
- `country_code` VARCHAR (ISO country code; assume it is populated for the event)
### 2) `user`
- `user_id` BIGINT (PK)
- `created_ts` TIMESTAMP
### 3) `pin_info`
- `pin_id` BIGINT (PK)
- `category` VARCHAR
Assumptions:
- Use UTC for date boundaries.
- “Today” means `DATE(impression_ts) = CURRENT_DATE`.
---
## Task A
For **each country**, find the **pin category** with the **highest number of impressions today**.
Requirements:
- If there is a tie for the highest impression count, return **all tied categories** for that country.
- Output columns:
- `country_code`
- `category`
- `impression_cnt`
---
## Task B
For **each country**, compute the **percent of active users** who are **highly active**.
Definitions (within the **last 7 days**, inclusive of today):
- **Active user**: a user with **≥ 1 impression event**.
- **Highly active user**: a user who satisfies **both**:
1) Active on **≥ 4 distinct days** (based on `DATE(impression_ts)`), AND
2) On **at least one day**, used **≥ 3 distinct surfaces** (based on distinct `surface` values that day).
Requirements:
- Output columns:
- `country_code`
- `active_users`
- `highly_active_users`
- `pct_highly_active` (as a decimal or percent; specify which in your answer)
- Clearly handle divide-by-zero when a country has 0 active users in the window.
Quick Answer: This question evaluates proficiency in SQL-based data manipulation and analytics, including joins, aggregations, date/time handling, grouping, tie-handling for top-N results, and percentage calculations to produce per-country category rankings and user activity metrics.