Compute multi-account user distribution and unread pct
Company: Meta
Role: Data Scientist
Category: Data Manipulation (SQL/Python)
Difficulty: easy
Interview Round: Technical Screen
You are working on a product where a **user** can have multiple **accounts**, and each account can receive **notifications**.
### Tables
Assume the following schemas:
**users**
- `user_id` BIGINT PRIMARY KEY
- `created_at` TIMESTAMP
**accounts**
- `account_id` BIGINT PRIMARY KEY
- `user_id` BIGINT NOT NULL REFERENCES `users(user_id)`
- `created_at` TIMESTAMP
**notifications**
- `notification_id` BIGINT PRIMARY KEY
- `account_id` BIGINT NOT NULL REFERENCES `accounts(account_id)`
- `created_at` TIMESTAMP
- `is_read` BOOLEAN -- `FALSE` means unread
### Tasks
1) **Account-count distribution**: Return the number of users who have:
- exactly **2** accounts
- exactly **3** accounts
- **4 or more** accounts
**Required output columns**: `account_bucket`, `num_users`.
2) **Unread-notification rate among multi-account users**: Among users with **2+ accounts**, compute the **percentage of users** who have **at least one unread notification** across any of their accounts.
**Required output columns**: `pct_users_with_unread` (as a percent or decimal; specify which you choose).
Quick Answer: This question evaluates a data scientist's ability to perform relational data manipulation and aggregation with SQL or Python, covering competencies such as handling one-to-many joins, grouping and bucketing counts, deduplicating across related tables, and computing cohort-level percentages from boolean flags.